If you spot the bug in the following piece of code, never mind reading further. Otherwise, go on.
Hadley Wickham, with Romain Francois and Dirk Eddelbuettel, has released a new package called dplyr. Once again, Wickham has redesigned an important component of what one does with R. The package is designed specifically for rectangular datasets, which may live in memory or in databases. With the new package, Hadley also convincingly closes the efficiency gap between plyr and data.table. This new efficiency comes largely from the C++ code injected into the performance-critical branches of dplyr (thanks to Romain and Dirk).
On the other hand, data.table, a mature and popular project started by Matthew Dowle, has been the go-to for performance-centric R programmers for some time now. While dplyr poses a serious challenge to data.table’s claim to fame, both data.table and Dowle are old hands at such competition. There has long been (very healthy) competition between Hadley’s plyr, Dowle’s data.table, and Wes McKinney’s pandas (a data munging library for Python).
In this post I add another data point to the set of benchmarks of the two packages. For the official take, see this and this.
R has long had perverse but technically valid models of Object Oriented Programming (OOP). ReferenceClasses, introduced by John Chambers, are the new promise of traditional OOP in R. This post implements a stack (and a queue) in R as an exercise in ReferenceClasses-based OOP.
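To give a flavour of the idea, here is a minimal sketch of a stack built with setRefClass. It is only an illustration, not necessarily the implementation from the post: the class name Stack, the field items, and the methods push, pop, and size are names I am assuming for the example.

```r
# A minimal stack as a Reference Class.
# The class, field, and method names are illustrative assumptions.
Stack <- setRefClass(
  "Stack",
  fields = list(items = "list"),
  methods = list(
    push = function(x) {
      # Wrap x in a list so any value, including NULL, can be stored.
      # `<<-` updates the field on the object itself (mutable semantics).
      items <<- c(items, list(x))
    },
    pop = function() {
      if (length(items) == 0L) stop("stack is empty")
      top <- items[[length(items)]]
      # Drop the last element; this leaves an empty list when one item remains.
      items <<- items[-length(items)]
      top
    },
    size = function() {
      length(items)
    }
  )
)

s <- Stack$new()
s$push(1)
s$push("two")
s$size()  # 2
s$pop()   # "two"
```

Unlike ordinary R objects, s is modified in place by push and pop, which is what makes this feel like traditional OOP. A queue could presumably follow the same pattern, removing items from the front of the list instead of the end.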
I’ve used textConnection in R for reading from strings. Only recently did I realize (duh!) that it can also be used for writing, much like Python’s StringIO module. In this post I show some simple examples of how to use this functionality.
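As a rough sketch of the write side (my own small example, with variable names chosen for illustration), a text connection opened in write mode collects whatever is written to it, and textConnectionValue returns the completed lines as a character vector:

```r
# Open an anonymous text connection for writing.
# Passing NULL as the object keeps the output inside the connection.
con <- textConnection(NULL, open = "w")

# Anything that accepts a connection can write to it.
writeLines(c("first line", "second line"), con)
cat("third line\n", file = con)

# Fetch the completed lines before closing the connection.
out <- textConnectionValue(con)
close(con)

out  # "first line" "second line" "third line"
```

Passing a variable name as a string instead of NULL is another option; in that case the output should end up in a character vector of that name, which is closer to how the read-mode usage looks.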