asb: head /dev/brain > /dev/www

My home, musings, and wanderings on the world wide web.

R: Converting a data.table to a multi way array (cube)

This post discusses converting a data.table to R's array data structure. The idea is analogous to converting a denormalized dataset, which presents both dimensions and facts as columns of a single table, into a fully normalized fact cube along the dimensions of the dataset.

The problem can be solved in several ways in R, each with its attendant constraints: for example plyr::daply, xtabs, by, or a home-brewed routine built from split and lapply. Without discussing the constraints I observed with the existing techniques, I present an alternative approach here that depends on unrolling the rectangular data structure into a linear structure and then reshaping it by manually counting the facts and dimension sizes (think strides). I chose data.table purely for efficiency; the same idea can be implemented with a data.frame with small changes to the code.
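The unroll-and-reshape idea can be sketched outside R as well. The following is a minimal Python illustration, not the post's R code: the function name `table_to_cube`, the sample rows, and the column names are all hypothetical. It assumes at most one fact per combination of dimension levels and uses column-major strides (first dimension varies fastest), which is how R lays out arrays.

```python
def table_to_cube(rows, dim_names, fact_name):
    """Unroll a table into a flat, column-major cube (hypothetical helper)."""
    # Sorted distinct levels of each dimension column
    levels = [sorted({row[d] for row in rows}) for d in dim_names]
    sizes = [len(lv) for lv in levels]

    # Column-major strides: stride of dimension k is the product of the
    # sizes of all dimensions before it
    strides = []
    step = 1
    for size in sizes:
        strides.append(step)
        step *= size

    # Position of each level within its own dimension
    position = [{level: i for i, level in enumerate(lv)} for lv in levels]

    flat = [None] * step  # None marks cells that have no fact
    for row in rows:
        offset = sum(position[k][row[d]] * strides[k]
                     for k, d in enumerate(dim_names))
        flat[offset] = row[fact_name]
    return flat, levels, strides


# Illustrative data: two dimensions (year, region) and one fact (sales)
rows = [
    {"year": 2012, "region": "east", "sales": 10},
    {"year": 2013, "region": "east", "sales": 20},
    {"year": 2012, "region": "west", "sales": 30},
]
flat, levels, strides = table_to_cube(rows, ["year", "region"], "sales")
```

Here `flat` is the linear structure; reading it with the computed strides gives the cell for any (year, region) pair, and the missing (2013, west) combination stays empty.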

Knuth-Fisher-Yates shuffling algorithm

So, yesterday, a friend sent me this article. Titles of the form "X every developer should know" pique my interest very much, so I headed over to the article, only to come away confused. What confused me further is that the article links to one of Jeff Atwood’s blog posts that discusses the same concept.

The central idea of the post is to correct a naive implementation of a shuffling algorithm. My confusion arose from the fact that, after reading the article, I could only think that the algorithm was blatantly incorrect rather than naive. The difference between the two algorithms, the naive one and the one ascribed to Knuth, Fisher, and Yates, is what sampling 101 calls the difference between sampling with replacement and sampling without replacement. This quote essentially highlights the naivete indeed:

How do we know that that above algorithm is biased? On the surface it seems reasonable, and certainly does some shuffling of the items.

Some shuffling does not mean a random permutation at all. This is akin to saying a naive implementation of a queue sometimes inserts incoming items at the head of the queue.
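The bias can be checked by brute force. The sketch below assumes the naive algorithm is the commonly cited one (swap each slot with a slot chosen from the whole array), since the article's code is not reproduced here; the function names are hypothetical. It enumerates every possible sequence of random picks for a three-element array and counts the resulting permutations.

```python
from collections import Counter
from itertools import product


def naive_outcomes(n):
    """Count results of the naive shuffle: swap slot i with ANY slot, for every i."""
    counts = Counter()
    for picks in product(range(n), repeat=n):  # n**n equally likely pick sequences
        items = list(range(n))
        for i, j in enumerate(picks):
            items[i], items[j] = items[j], items[i]
        counts[tuple(items)] += 1
    return counts


def kfy_outcomes(n):
    """Count results of Knuth-Fisher-Yates: swap slot i only with a slot in 0..i."""
    counts = Counter()
    slots = list(range(n - 1, 0, -1))          # i = n-1 down to 1
    ranges = [range(i + 1) for i in slots]
    for picks in product(*ranges):             # n! equally likely pick sequences
        items = list(range(n))
        for i, j in zip(slots, picks):
            items[i], items[j] = items[j], items[i]
        counts[tuple(items)] += 1
    return counts


naive = naive_outcomes(3)
kfy = kfy_outcomes(3)
```

For three elements the naive version has 27 equally likely paths spread over 6 permutations, and 27 is not divisible by 6, so the permutations cannot all be equally likely. The Knuth-Fisher-Yates version has exactly 6 paths, one per permutation.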

Another shuffling algorithm?

On the other hand, the article did usefully remind me of a shuffling algorithm I recently wrote to solve problem number 25 of 99 Lisp Problems. The minor difference between the two algorithms is that the Lisp algorithm pops an element from the original array on each turn and collects it in a new array, which at the end is a random permutation (shuffle) of the original array. The KFY algorithm instead swaps two elements on each turn. It can be worked out that, mathematically, the two algorithms are identical, but the KFY algorithm saves you from making a copy. Therefore, yes, KFY is an algorithm that every programmer should know. Additionally, every programmer should also know the standard library of their language.
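The two variants can be put side by side. This is a Python sketch rather than the Lisp solution mentioned above, and the function names are hypothetical: the first pops random elements into a new list, the second performs the KFY swaps in place.

```python
import random


def shuffle_by_popping(items, rng=random):
    """Return a shuffled copy: repeatedly pop a random element into a new list."""
    pool = list(items)  # working copy, so the caller's list is left untouched
    shuffled = []
    while pool:
        shuffled.append(pool.pop(rng.randrange(len(pool))))
    return shuffled


def shuffle_in_place(items, rng=random):
    """Knuth-Fisher-Yates: swap within the list, so no second list is needed."""
    for i in range(len(items) - 1, 0, -1):
        j = rng.randrange(i + 1)  # pick from the not-yet-fixed prefix 0..i
        items[i], items[j] = items[j], items[i]


original = list(range(10))
popped = shuffle_by_popping(original)

swapped = list(original)
shuffle_in_place(swapped)
```

In both versions each turn picks uniformly among the elements not yet placed; only the bookkeeping differs. On the standard-library point, Python already ships this as `random.shuffle`, which shuffles a list in place.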

Editing markdown in Vim and previewing in Firefox.

Lately, I have taken to maintaining most of my text in Markdown (desperately trying to avoid Emacs’ org-mode), and being able to preview it while editing in Vim is nice to have. So I searched a little and found this nice little snippet by Ellen Gummesson, which I then adapted to my preferences:

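" Preview markdown in Firefox: render the current buffer to an HTML file
" next to the source; open (or reload) that file in Firefox to view it.

function! PreviewMarkdown()
  " Use full paths so no directory change is needed; a '!cd' would only
  " affect its own short-lived subshell.
  let inFile = expand('%:p')
  let outFile = expand('%:p:r') . '.html'
  silent execute '!python -m markdown ' . shellescape(inFile, 1)
        \ . ' > ' . shellescape(outFile, 1)
  " A shell command run with :silent leaves the screen stale, so redraw.
  redraw!
endfunction

augroup markdown
    au!
    au BufNewFile,BufRead *.md,*.markdown setlocal filetype=ghmarkdown
    au BufNewFile,BufRead *.md,*.markdown setlocal textwidth=0
    " Buffer-local, non-recursive mapping, defined only for markdown buffers
    autocmd FileType ghmarkdown nnoremap <buffer> <LocalLeader>p
          \ :call PreviewMarkdown()<CR>
augroup END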

Machines that make music and steal jobs

I stumbled upon two different videos today while surfing the web, both of which impressed me deeply. It was only later that I connected the dots between the two. The first is a breathtaking example of what human ingenuity and creativity can achieve with machines; the other is a Luddite damnation of the same phenomenon. I will put these here and refrain from any comments.

Digitopoly provides counter-arguments to the conclusions of the second video which I, personally, found unconvincing in their depth of analysis.