2018-2019 exchange Winter 2019 Newsletter FINAL | Page 15

– 49 versus 50-50, well then you need more samples. The type of signal you are trying to detect determines how much data you need, as does how good the data is. Oftentimes in business you want to solve a quick problem, so you don’t need that much. A lot of times, though, you want to create some competitive advantage: we want to find some data that helps us be a little better than everybody else. And for that, having a large data set and having additional input data that helps us predict something a little better is crucial.

Jen: Are there certain types of data that would not interest you? That is, data you don’t see any value in having, for example, data that is 50 years old.

Glenn: I do get that question a lot. There are two kinds of data that I would not consider valuable. One is data that is no longer relevant or was never relevant. From a data age perspective, it depends on how quickly the phenomenon that you are trying to predict changes. For life insurance mortality data, we easily look at data that is 25 or 30 years old, and it is still quite relevant. This is because, luckily, death is a rare phenomenon and the major causes of death haven’t really changed over the past few decades; they have changed a little bit, but not dramatically. That data would be valuable. If you are looking at who is going to buy the next iPhone and you want to do a marketing campaign, your 10-year-old data would be quite useless because the market for smartphones has changed dramatically over the past 10 years. Data is also not helpful if it has been corrupted or cannot be used because of regulation.

Jen: Is there data on certain media that you would not use because it would be cost prohibitive? For example, historical data on microfilm.

Glenn: Cost certainly does matter, and sometimes data can be hard to get to if it’s on microfiche and more than a couple of decades old, or if it’s handwritten on paper.
For an underwriting application, it might be useful, but for a marketing application it is probably too expensive. So, one does have to look at the cost of digitizing the data and making it useful versus the value it actually has for that particular application.
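
Glenn’s earlier point, that a split close to 50-50 takes more samples to detect than a lopsided one, can be illustrated with a rough sample-size calculation. The sketch below is not from the interview: the function name `samples_needed`, the example proportions 0.51 and 0.60, and the 5% significance and 80% power settings are all assumptions chosen for illustration, and the formula is the standard normal-approximation sample size for a one-sample, two-sided test of a proportion.

```python
import math
from statistics import NormalDist


def samples_needed(p_true, p_null=0.5, alpha=0.05, power=0.8):
    """Approximate sample size to tell p_true apart from p_null.

    Hypothetical helper for illustration only. Uses the normal
    approximation for a one-sample, two-sided test of a proportion.
    """
    if p_true == p_null:
        raise ValueError("p_true must differ from p_null")
    std_normal = NormalDist()
    z_alpha = std_normal.inv_cdf(1 - alpha / 2)  # critical value for the test
    z_beta = std_normal.inv_cdf(power)           # quantile for the desired power
    spread = (z_alpha * math.sqrt(p_null * (1 - p_null))
              + z_beta * math.sqrt(p_true * (1 - p_true)))
    return math.ceil((spread / (p_true - p_null)) ** 2)


# Assumed example values: a weak signal (51% vs. 50%) and a strong one (60% vs. 50%).
weak_signal = samples_needed(0.51)
strong_signal = samples_needed(0.60)
print(f"weak signal:   {weak_signal} samples")
print(f"strong signal: {strong_signal} samples")
```

Under these assumptions the weak signal needs far more samples than the strong one, because the required sample size grows with the inverse square of the gap between the two proportions.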