Tuesday, April 3, 2012

Links of interest

In my class we recently discussed some of the reasons why there is an explosion of data. The WSJ reported on a couple of reasons that maybe we had not thought of:

Employees measuring their productivity


Using customer information in real time  


I found this video a while back that is along the lines of people tracking their own data; it also does a fair job explaining the ideas of correlation vs causation:

Anecdotal Evidence: Bullying edition

The Wall Street Journal recently ran an article saying the recent attention given to bullying in schools is largely driven by a handful of high-profile cases that have led to suicides, but is not supported by any data. The article points out that data collected on bullying does not show a recent increase; in fact, the opposite is true. This article is a good example of how the media (and all of us) can often be misled by anecdotal evidence into believing a claim without looking to see if the anecdotes are typical... Making decisions based on outliers might not be appropriate, and a closer look at what is actually going on is likely to lead to a more informed decision.

Thursday, March 22, 2012

Stats from recordings: case study in basketball

The Wall Street Journal reported on a new business that takes video of high school and college basketball games and performs some impressive indexing and data pulling in order to generate statistics that coaches can use to improve their teams. Here's a clip from CNBC (beware, there is likely a commercial at the beginning):






The dissection of a game into parts that can be recorded, tracked, and analyzed is one of the basic premises of data analysis. Without the data, coaches are left to manage based on anecdotal evidence or past lore (as depicted in Moneyball). But once data has been collected and information gleaned from it, coaches are able to see the effects of their decisions and individual players' abilities. From the WSJ article:

"For coaches, Krossover's results can be rather shocking. For years, Tammy Lusinger, head coach of the girls' basketball team at Mansfield Summit High School in Arlington, Texas, had a favorite play called "Bama" in which the girls cleared out one side of the court then set up a series of screens to free a player for an open shot. "It's a great looking little play," Lusinger said last week.

"But after she started using Krossover, she was in for a surprise: The numbers showed that they were only scoring on that play 5% of the time. Earlier this month, Lusinger and Mansfield Summit won their second Texas state championship in the last four years."

The difficulty lies in the data collection part. Krossover is able to harness cheap but capable labor to scrutinize every tape one at a time. Businesses with thousands of operations may not have the luxury of the same capabilities; however, I'm not sure why they could not. For example, a manager of a McDonald's could record the interactions of cashiers and have them indexed, sliced, and diced the same as a basketball coach. This might bring up images of a "Big Brother" firm that records and analyzes every movement of its employees. On the other hand, it may bring to light the differences between stellar employees and just good employees.

Friday, March 16, 2012

Survival rates of heart patients differ depending on marital status

The Wall Street Journal reported today on a study about survival rates of heart patients:
In a study of nearly 600 heart-surgery patients, unmarried people were nearly twice as likely to die as married people within five years of the procedures. For married patients, the survival rate was roughly 85%; for the unmarried, 70%


At first I wasn't sure how "nearly twice as likely to die" was calculated, but it follows from the numbers:
An 85% survival rate means 15% of married patients die, so twice as likely would be 30%.
A 70% survival rate means unmarried patients die at a rate of 30%, or roughly 3 in 10, which is about double the married rate.
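
As a quick check of the "nearly twice as likely" figure, here is a minimal Python sketch that converts the quoted survival rates (roughly 85% and 70%) into death rates and takes their ratio. The variable names are mine, and the inputs are the rounded figures from the quote, not the study's underlying counts, so the result is only approximate.

```python
# Rounded five-year survival rates quoted from the WSJ summary of the study.
married_survival = 0.85
unmarried_survival = 0.70

# The death rate is the complement of the survival rate.
married_death_rate = 1 - married_survival
unmarried_death_rate = 1 - unmarried_survival

# Ratio of the two death rates (often called the relative risk).
risk_ratio = unmarried_death_rate / married_death_rate

print(f"Married death rate:   {married_death_rate:.0%}")
print(f"Unmarried death rate: {unmarried_death_rate:.0%}")
print(f"Risk ratio:           {risk_ratio:.1f}")
```

With these rounded inputs the ratio comes out at about 2, which is consistent with the article's wording.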

Also, the bigger, more revealing part of the story is here:
Much of the long-term difference could be explained by unmarried patients being more likely to smoke (a behavior that spouses can influence, naturally).

So the headline could have read "People who smoke are more likely to die after heart surgery than those who don't (oh, and people who smoke also have a higher chance of being single)." But then no one would read it, and it wouldn't make a heartwarming (pun intended) story about how marriage can literally save your heart (with which I wholeheartedly (it's too easy) agree).

What would it mean to your business if a certain segment of customers was found to be less receptive to your offering, and by "less receptive" I mean more likely to die?

Tuesday, June 14, 2011

Big Data and Music Industry





The Economist recently reported that the music industry has started using some concepts of data analysis to better understand its customers. A couple of my favorite lines include:

Every time a track is uploaded to or played on YouTube, every time it is sold by iTunes, streamed on Spotify, shared on a pirate network, liked on Facebook or tweeted about, it gives off a digital signal. A cottage industry has sprung up to process these signals and feed the results to the record companies.

“We’re no longer just a wholesaler of music,” says Paul Smernicki of Universal. As their traditional business declines, the music companies are moving into live music and merchandise. To succeed in those markets, they will need to become experts in fan behaviour, understanding not just how and why people buy music but how they weave it into their lives. In that, data will be crucial.








Tuesday, March 1, 2011

Descriptive, Predictive, & Prescriptive analytics

The Institute for Operations Research and the Management Sciences (INFORMS) has recently gone through a process of rethinking what "Business Intelligence" or "Business Analytics" means. They have settled on what seems to be a couple of frameworks (here is the first, here is the second). But first, their definition of analytics:
Analytics facilitates realization of business objectives through reporting of data to analyze trends, creating predictive models for forecasting and optimizing business processes for enhanced performance.
Next, they identified three main categories of analytics:

1. Descriptive - the use of data to find out what happened in the past (I would add:  what is happening now)
    - data modeling, trend reporting, regression analysis
2. Predictive - the use of data to find out what could happen in the future
    - data mining, predictive modeling
3. Prescriptive - the use of data to prescribe the best course of action for the future
    - optimization, simulation
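
To make the three categories concrete, here is a small Python sketch of my own, not something from INFORMS. The monthly sales figures, the price and cost, and the candidate order quantities are all invented for illustration. A straight-line trend stands in for "predictive modeling", and picking the best of a few order quantities stands in for "optimization".

```python
# Hypothetical monthly unit sales (invented numbers, for illustration only).
sales = [100, 110, 120, 130, 140, 150]
n = len(sales)
months = list(range(n))

# Descriptive: summarize what already happened.
average_sales = sum(sales) / n

# Predictive: fit a least-squares straight line and extend it one month ahead.
mean_month = sum(months) / n
slope = (
    sum((x - mean_month) * (y - average_sales) for x, y in zip(months, sales))
    / sum((x - mean_month) ** 2 for x in months)
)
intercept = average_sales - slope * mean_month
forecast = intercept + slope * n  # predicted sales for the next month


# Prescriptive: choose the action with the best outcome given the forecast.
def profit(order_qty, demand, price=5.0, cost=3.0):
    """Profit when every ordered unit costs `cost` and each unit sold earns `price`."""
    units_sold = min(order_qty, demand)
    return price * units_sold - cost * order_qty


candidate_orders = [140, 150, 160, 170, 180]  # hypothetical choices
best_order = max(candidate_orders, key=lambda q: profit(q, forecast))

print("Descriptive  - average past sales:", average_sales)
print("Predictive   - forecast for next month:", forecast)
print("Prescriptive - best order quantity:", best_order)
```

Each step feeds the next: the description summarizes the history, the forecast extends it, and the prescription uses the forecast to pick an action.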
Here is a short clip from IBM (hat tip The Vantage Point) discussing the three categories:



The second framework similarly identifies three different uses or users of business intelligence as described in their Venn diagram:


I think neither framework is bad, but I worry that both seem to imply some level of hierarchy among the three elements, as if one were vastly superior to the others.

Here is another interesting post on the subject.

Monday, February 14, 2011

Vegas provides more than just bad odds....

Economic Downturn hits wedding business in Vegas?
 Fewer than 92,000 couples married in or around Sin City in 2010. The last time the city married fewer people, it was 1993.
 ...in 2004, when 128,250 couples tied the knot. Fewer people said "I do" in each subsequent year.
 Is it due to the poor economy or to a decline in marriages? How would you establish causality, or at least separate the effect of the poor economy from a shift in social norms?