Archives for posts with tag: OLS

Shaw Communications (SJR) reports results in segments that include Wireline customers, both businesses and consumers, and Wireless customers on prepaid and postpaid plans. The company has recognized around 1.3 billion (CAD) in revenue in recent quarters, of which around 300 million (CAD) is earned from the Wireless segment.
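
A quick back-of-the-envelope check on that revenue mix, using only the rough figures above (approximate numbers in millions of CAD, purely for illustration):

```python
# Rough revenue mix from the approximate figures above (millions of CAD)
total_revenue = 1300          # ~1.3 billion per quarter
wireless = 300                # ~300 million from the Wireless segment
wireline = total_revenue - wireless

print(f"Wireless share: {wireless / total_revenue:.0%}")   # roughly a quarter
print(f"Wireline share: {wireline / total_revenue:.0%}")   # roughly three quarters
```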

An interesting pattern emerges from the customer segments. Shaw is losing Wireline customers but gaining Wireless customers, and most of the decline in the Wireline segments comes from consumers rather than business customers. What does it mean for expectations of future revenue if a company is losing customers in the segments that make up around three quarters of its revenue and gaining customers in the segments that make up around a quarter of it? To attempt to answer this question, I ended up:

Read the rest of this entry »

Friday night I gave a talk to business school students about machine learning. The goal of the talk was to put some context around the topic using detailed examples. The talk began with exploratory data analysis: examining summary statistics and checking the dataset for erroneous observations (e.g., negative prices). The dataset contains housing prices and the characteristics of each house (size, age, etc.). I also included two irrelevant variables: final grades from my undergraduate auditing courses and a randomly generated variable.
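
The dataset from the talk isn't reproduced here, but a minimal sketch of those first-pass checks, assuming a hypothetical file called housing.csv with placeholder column names like price and size_sqft, might look like this:

```python
# Sketch of the first-pass checks described above; the file name and
# column names are hypothetical placeholders, not the dataset from the talk.
import numpy as np
import pandas as pd

houses = pd.read_csv("housing.csv")

# Summary statistics as a first look at every variable
print(houses.describe())

# Flag observations that cannot be right, e.g. negative prices
negative_prices = houses["price"] < 0
print(f"{negative_prices.sum()} observations with negative prices")
houses = houses[~negative_prices]

# Add a randomly generated variable that should carry no signal,
# similar in spirit to the irrelevant variables mentioned above
rng = np.random.default_rng(42)
houses["noise"] = rng.normal(size=len(houses))
```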

Read the rest of this entry »

Scaling or deflating variables is common in accounting and finance research. It is often done to mitigate heteroskedasticity or the influence of firm size on parameter estimates. However, using analytic results and Monte Carlo simulations, we show that common forms of scaling induce substantial spurious correlation via biased parameter estimates. Researchers are typically better off dealing with both heteroskedasticity and the influence of large firms using techniques other than scaling.

The full paper is here:
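
None of the paper's analysis or simulation design is reproduced here, but a minimal Monte Carlo sketch of the basic deflation mechanism is below: two series generated independently of each other, each with a nonzero mean, appear strongly correlated once both are deflated by a common and highly variable scale measure.

```python
# Minimal sketch (not the paper's design): deflating two unrelated
# variables by a common "size" measure induces spurious correlation.
import numpy as np

rng = np.random.default_rng(0)
n_firms, n_sims = 500, 1_000
corr_raw, corr_scaled = [], []

for _ in range(n_sims):
    size = np.exp(rng.normal(0.0, 1.0, n_firms))  # lognormal deflator
    y = rng.normal(10.0, 1.0, n_firms)            # nonzero-mean numerators,
    x = rng.normal(10.0, 1.0, n_firms)            # drawn independently
    corr_raw.append(np.corrcoef(y, x)[0, 1])
    corr_scaled.append(np.corrcoef(y / size, x / size)[0, 1])

print(f"average correlation, unscaled: {np.mean(corr_raw):+.3f}")
print(f"average correlation, scaled:   {np.mean(corr_scaled):+.3f}")
```

With these (deliberately extreme) parameters the unscaled correlation averages near zero while the scaled correlation is close to one, because both ratios are dominated by the shared 1/size component.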

This post contains an example that shows why a degree of freedom is lost each time a regressor is added to an OLS model. The OLS first order conditions, which come from treating OLS as a set of partial derivatives that minimize the sum of squared residuals, are the foundation of the posted Stata code.
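
As a refresher on the algebra the post builds on (this is standard OLS, not the posted Stata code itself): with $k$ regressors, including the constant, OLS picks coefficients to solve

$$\min_{b_1,\dots,b_k} \sum_{i=1}^{n}\Big(y_i - \sum_{j=1}^{k} b_j x_{ij}\Big)^{2}.$$

Setting the partial derivative with respect to each $b_j$ to zero gives one first order condition per regressor,

$$\sum_{i=1}^{n} x_{ij}\,\hat{u}_i = 0, \qquad j = 1,\dots,k,$$

where $\hat{u}_i = y_i - \sum_{j} \hat{b}_j x_{ij}$ are the residuals. Each regressor therefore adds one linear restriction on the $n$ residuals, leaving only $n-k$ of them free to vary, which is exactly the degree of freedom lost with each added regressor.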

Read the rest of this entry »