I have often needed to do this: split a factor variable into a set of dummies. For example, when randomForest will not let you use a factor with more than 32 levels, you may want to take this route. Or, if you are like me and don't like to rely too much on R's formula interface, it may help you choose the base category for a factor or suppress the intercept easily.
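To make the idea concrete, here is a minimal sketch of one way to do the split without the formula interface. The function name `factor_to_dummies` and its `base` argument are my own illustration, not something from a package, and the sample data is made up.

```r
# Hypothetical helper: turn a factor into a 0/1 indicator matrix.
# `base`, if given, names the level whose column is dropped
# (that level becomes the reference category).
factor_to_dummies <- function(f, base = NULL) {
  f <- as.factor(f)
  lv <- levels(f)
  # Compare every observation with every level:
  # rows are observations, columns are levels.
  d <- outer(as.character(f), lv, "==") * 1L
  colnames(d) <- lv
  if (!is.null(base)) {
    d <- d[, setdiff(lv, base), drop = FALSE]
  }
  d
}

# Made-up example data
x <- factor(c("a", "b", "c", "a"))
full <- factor_to_dummies(x)                 # one column per level
no_base <- factor_to_dummies(x, base = "b")  # "b" is the reference category
```

Keeping every column corresponds to a model without an intercept, while dropping one column picks that level as the base category. Note that missing values in the factor come through as rows of `NA` in this sketch.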
Mixed thoughts about Stack Overflow
I used to love Stack Overflow when it was a read-only site for me. Since I started answering questions, I have had mixed feelings about the site. Here are my thoughts, in no particular order.
R: Deprecate logical subsetting?
Since I discovered it, I have found logical subsetting in R a very elegant idiom. Recently I’ve had a change of heart due to two observations. Here is why I propose staying away from it.
R: Recreating the history of a stock index's membership
If you work in finance, have you ever needed to identify the stocks that were in an index at any given date? Or perhaps a series of dates? Given how central indices are to financial markets and how frequently they change, this is a common problem.