The importance of a factor depends on how importance is defined
Multiple regression lets us explore how much each variable contributes to the prediction of the criterion. The word “importance” is fuzzy and open to different interpretations, so the importance of a factor depends on how importance is defined. Some people mean theoretical importance: the change in the dependent variable associated with a given change in the predictor variable, which can be measured using the regression coefficient. Others take importance to mean the increase in the score of the dependent variable that a predictor contributes, measured through the unstandardised regression coefficient; this interpretation is popular in economics. A third group means dispersion importance, which is popular in behavioural science: the share of the variance of the dependent variable explained by the regression equation that is attributable to each predictor variable.
Relative importance and digital marketing
Digital analysts are constantly asked: which marketing channels are the most important ones for our company to focus on? Determining the true importance of each marketing channel is called attribution modelling. You can read more about attribution modelling here, here and here. Within attribution modelling, measuring relative importance has attracted a lot of interest in recent years. Variable importance in regression refers to quantifying an individual regressor’s contribution to a multiple regression model. In this article, we shed light on the black box model of Google 360 by investigating different regression approaches, such as Shapley value regression, PMVD (a more recently proposed method by Feldman) and several others. Beyond multiple linear regression, other models such as random forests have also received a lot of attention recently for assessing variable importance.
In an associated study, relative importance is defined as percentage contribution. The simplest measure of importance is the zero-order correlation, in which importance is defined as the direct predictive ability of a predictor variable on its own. To get the big picture of the importance of channels, you can start by visualising their correlations with conversions.
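As a minimal sketch of this first step, the snippet below computes the zero-order (Pearson) correlation between each channel and conversions and ranks the channels by its absolute value. The channel names and weekly figures are hypothetical, made up purely for illustration, and the `pearson` helper is written out by hand so the example needs nothing beyond the standard library.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)

# Hypothetical weekly sessions per channel (made-up numbers, not real data).
channels = {
    "organic": [120, 135, 150, 160, 172, 180, 195, 210],
    "paid_search": [80, 60, 95, 70, 110, 90, 120, 100],
    "email": [30, 45, 28, 50, 33, 47, 31, 52],
}
# Hypothetical weekly conversions for the same eight weeks.
conversions = [22, 23, 28, 27, 33, 32, 37, 36]

# Zero-order importance: each channel's correlation with conversions on its own.
zero_order = {name: pearson(values, conversions) for name, values in channels.items()}

# Rank channels by the strength of the correlation, ignoring its sign.
ranking = sorted(zero_order, key=lambda name: abs(zero_order[name]), reverse=True)
for name in ranking:
    print(f"{name:12s} r = {zero_order[name]:+.3f}")
```

Each correlation here looks at one channel in isolation, so two channels that move together will both score highly even if only one of them drives conversions.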
Correlation gives us the overall picture. However, if we define importance as the predictive ability of each factor in conjunction with the other factors, correlation falls short, because it does not consider the effect of each predictor in the context of the other predictors. We need to measure relative importance using regression to know how the expected value of the outcome changes when a predictor changes.
R has a package for calculating relative importance: relaimpo, written by Ulrike Grömping. A description of the package is available in the Journal of Statistical Software.
Using this package, we can build a model that measures the relative importance of the factors using different metrics. In the following example, we measure the relative importance of five channels on conversions with seven metrics.
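To make concrete what such metrics compute, here is a minimal Python sketch of three of them, using the names relaimpo gives them: `first` (R² of a channel on its own), `last` (the gain in R² when the channel is added after all the others) and `lmg` (the gain in R² averaged over all orderings, which is the Shapley value decomposition). This is not the relaimpo implementation. It uses three hypothetical channels with made-up numbers rather than the five channels of the original example, and the helpers `solve`, `centre`, `r_squared` and `importance` are illustrative names introduced here.

```python
from itertools import combinations
from math import factorial

def solve(a, b):
    """Solve a @ x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [v - factor * p for v, p in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]

def centre(v):
    """Subtract the mean, which is equivalent to fitting an intercept."""
    mean = sum(v) / len(v)
    return [x - mean for x in v]

def r_squared(xs, y):
    """R^2 of an OLS fit (with intercept) of y on the predictor columns xs."""
    if not xs:
        return 0.0
    xc, yc = [centre(x) for x in xs], centre(y)
    xtx = [[sum(a * b for a, b in zip(u, v)) for v in xc] for u in xc]
    xty = [sum(a * b for a, b in zip(u, yc)) for u in xc]
    beta = solve(xtx, xty)
    explained = sum(b * c for b, c in zip(beta, xty))
    return explained / sum(v * v for v in yc)

def importance(data, y):
    """Return the full-model R^2 and the first/last/lmg metrics per predictor."""
    names = list(data)
    p = len(names)
    cache = {}

    def r2(subset):
        key = frozenset(subset)
        if key not in cache:
            cache[key] = r_squared([data[n] for n in sorted(key)], y)
        return cache[key]

    full = r2(names)
    result = {}
    for name in names:
        others = [n for n in names if n != name]
        lmg = 0.0
        for k in range(p):
            # Shapley weight for subsets of size k that precede this predictor.
            weight = factorial(k) * factorial(p - k - 1) / factorial(p)
            for subset in combinations(others, k):
                lmg += weight * (r2(subset + (name,)) - r2(subset))
        result[name] = {"first": r2([name]), "last": full - r2(others), "lmg": lmg}
    return full, result

# Hypothetical weekly data (made-up numbers, not real data).
channels = {
    "organic": [120, 135, 150, 160, 172, 180, 195, 210],
    "paid_search": [80, 60, 95, 70, 110, 90, 120, 100],
    "email": [30, 45, 28, 50, 33, 47, 31, 52],
}
conversions = [22, 23, 28, 27, 33, 32, 37, 36]

full, metrics = importance(channels, conversions)
print(f"full-model R^2 = {full:.3f}")
for name, m in metrics.items():
    share = m["lmg"] / full  # percentage contribution under lmg
    print(f"{name:12s} first={m['first']:.3f} last={m['last']:.3f} "
          f"lmg={m['lmg']:.3f} share={share:.1%}")
```

The `lmg` values add up to the full-model R², which is what allows them to be read as percentage contributions; `first` and `last` generally do not sum to R² when the channels are correlated with each other. The number of subsets grows exponentially with the number of predictors, so this brute-force version is only practical for a handful of channels.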
The above metrics are explained in detail here and here, so we will not go into more detail in this post. Instead, we will explain how to make these metrics more actionable and how to decide which one is most suitable in a particular case. It is also worth noting that the concept of relative importance comes from experimental design, where we are able to piece components together any way we want. In digital marketing, the predictors normally have a relevant and known order, which helps us pick the most appropriate metric.
In marketing, everyone is looking for budget optimisation, but before we can do that we need to know the importance of each channel the budget is spent on. Relative importance methods help us measure the percentage contribution of our channels and allocate budget to them according to their importance. As mentioned above, in the next posts we will go into the details of the different metrics and provide some tricks for selecting the most relevant metric in a particular case.