Forever Learning

Forever learning and helping machines do the same.

Archive for July 2010

the Same Old Song

[I’ve tweeted about this before.]

A few months ago my friend and neighbor Olav was fiddling around with a dataset of movie plot descriptions he downloaded from the Internet Movie Database (IMDb). If I recall correctly, he was taking a stab at the Netflix Prize. We discussed this for a while over coffee, but (as usual) our conversations were all over the place; and somewhere along the line we wondered what songs are used most often in movies.

Play!

What is that song they always play? The one that goes like ‘#dun dun dun dun dudun dun dun duuuuun#‘. You know?

The IMDb site offers lots of different datasets for download, and we quickly found that one of them contains soundtrack listings (the aptly named file soundtracks.list.gz). Now it was just a matter of filtering out the unnecessary contextual data and counting songs. Olav, who does data mining for a living, quickly got all this done using spiffy point-and-click tools. I proceeded to ask Twitter what people thought the answer would be.
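For those without spiffy point-and-click tools, the filter-and-count step could be sketched in a few lines of Python. This is not what Olav did, just a rough illustration. It assumes that every song entry in the file sits on a line starting with `- "` with the title between the first pair of double quotes; the real file layout may differ, and the function names and sample lines below are made up for the example.

```python
import gzip
from collections import Counter


def count_songs(lines):
    """Count song titles in an iterable of text lines.

    Assumption: a song entry looks like '- "Song Title"'; all other
    lines (movie headers, credits) are contextual data and are skipped.
    """
    counts = Counter()
    for line in lines:
        if line.startswith('- "'):
            end = line.find('"', 3)  # closing quote of the title
            if end != -1:
                counts[line[3:end].strip()] += 1
    return counts


def count_songs_in_file(path):
    """Read a gzipped listing (e.g. soundtracks.list.gz) and count its songs."""
    # latin-1 is an assumption about the file encoding
    with gzip.open(path, "rt", encoding="latin-1", errors="replace") as handle:
        return count_songs(handle)


# Made-up sample lines in the assumed format, not real IMDb data
sample = [
    "# Some Movie (1999)",
    '- "Jingle Bells"',
    "  Written by James Pierpont",
    "# Another Movie (2001)",
    '- "Jingle Bells"',
    '- "Auld Lang Syne"',
]
top = count_songs(sample).most_common(2)
print(top)
```

Calling `count_songs_in_file("soundtracks.list.gz").most_common(5)` would then give a top five, provided the assumptions about the format hold.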

The top five results turned out to be a collection of classics. The songs played most often in movies (according to the IMDb data) are as follows.

  1. “Jingle Bells” (220x)
  2. “William Tell Overture” (204x)
  3. “Home Sweet Home” (160x)
  4. “Auld Lang Syne” (149x)
  5. “Rock-a-Bye Baby” (140x)

Not at all what we were expecting, but quite obvious when you think about how many Christmas movies are out there. Data mining is very often like that. You find answers that were unexpected, but also unsurprisingly obvious.

It’s the same song, but it never gets old.

[Much later, a friend (can’t remember exactly who) noted that the song that is played most often in theaters is probably not listed in the data set the IMDb provides. It’s the 20th Century Fox intro.]

Written by Lukas Vermeer

July 16, 2010 at 16:53

Posted in Datamining

the Trick to 42

Frank Buytendijk, in his book Performance Leadership, writes the following.

Most people I asked, and most sources I referred to, define an organization similarly as “a group of people that share the same goals and objectives”.

[…] Working with this definition of an organization, leads you to think that stakeholders all share a set of central goals and objectives, and can be aligned in this direction. In reality, nothing could be further from the truth. In fact, many of the goals and objectives live at odds with one another. Shareholders want the highest possible shareholder value; employees look for job security and a place to build their skills and make a career; customers want a good price and a decent product or service; and suppliers want to sell as much as they can.

He subsequently proposes an alternative definition.

I have adopted what I think is a better definition of what constitutes an organization: An organization is a unique collaboration of stakeholders for the purpose of realizing goals they could not achieve by themselves. The trick to performance management is not to align everyone to the same goals and objectives, but in finding ways to bridge conflicting goals and objectives.

Shooting Range

Each his own Target

In my view, Frank is being modest here. This is not just the trick to performance management; it is the trick (but not the answer) to life, the universe and everything. Once you realize that not everyone wants the same things you do, the world gains quite a few interesting dimensions.

So don’t try to align everyone to your goals and objectives, but find ways to bridge the gaps and resolve conflicts. There is no need for everyone to agree to want the same thing, if we can find a solution where everyone can have what he or she wants.

Please, feel free to disagree with me in the comments; I’m sure we can work things out, and learn a thing or two on the way. 🙂

Written by Lukas Vermeer

July 10, 2010 at 15:27

Posted in BI, Oracle

Copying Bits in the Fastlane

I’ve finally gotten ’round to rooting my ‘old’ G1 phone and installing the Cyanogen firmware. I will not bore you with the technical details of flashing firmware, but I would like to share an observation. To illustrate my point, I will need to cite some of the steps involved.

The technicalities of the following citations do not matter all that much, so if you want, you can skip right to the short translation. There will be no pop-quiz at the end, I promise.

Rooting the G1 is not as complicated as you might expect from the enormous number of steps involved. The biggest obstacle for me proved to be the fact that I do not have a copy of Windows installed on any of my computers, so the following steps in “the unlockr” guide (referred to by none other than Cyanogen himself) were practically impossible for me (underlining is mine; and as promised, there is a short translation below).

9. Now, goto http://download.cnet.com/HxD-Hex-Editor/3000-2352-10891068.html?part=dl-HxDHexEdi&subj=uo&tag=button to download the HxD Hex Editor. Save it and install it to your computer.

10. Take your SD card out of your phone and put it into the SD adapter it came with. Then put that into your computer so it shows up on your computer as Removable Disk.

11. Open the Hex Editor (Run as Administrator if one Vista or Windows 7) and click on the Extra tab, then click on Open Disk. Under Physical Disk select Removable Disk (your SD card you just put into the computer). Make sure to UNcheck “Open as ReadOnly”. Click OK.

12. Goto the Extra tab again and click Open Disk Image. Open up the goldcard.img that you saved from your email. You should now have two tabs, one is the SD card (Removable Disk) and the other is the goldcard.img. Press OK when prompted for Sector Size 512 (Hard Disks/Floppy Disks).

13. Click on the Goldcard.img tab and click on the Edit tab and click Select All. Then click on the Edit tab again and click Copy.

14. Click on the Removable Disk tab (Your SD Card) and select offset 00000000 to 00000170 then click on the Edit tab and click Paste Write.

15. Click on File then click Save.

Translation: download software of dubious origin, run software as Administrator, open disk, click a few things, copy/paste a few bits, save. Apparently all we want to do here is copy some bits from a file to (the beginning of) a disk. Copying a few bits can’t be this hard, right?

I mean, this is a computer; it copies bits all the time!

Luckily, I was not the only Android enthusiast stumped by these Windows-only (and frankly, rather convoluted, manual and error-prone) steps. Someone else had already solved the problem for me (thanks, Sven). The steps in his guide (for Mac or Linux) seem a lot simpler to me (again, underlining mine and translation below).

1 .Open your Mac’s Terminal under Applications -> Utilities ->Terminal (Or your Linux-Terminal)

2. Enter diskutil list and confirm

3. You should be able to see your SD-Card now. You can recognize it from its size (mine is 2GB) and that its type is DOS_FAT_32. As IDENTIFIER it says disk2s1.Remember this identifier.

4. Now unmount the card the folowing way: If your identifier was disk2s1 enter diskutil unmountDisk /dev/disk2 and confirm. If not, you have to replace the 2 with your value (the value that is written right after the word disk).

5. Now create your goldcard with sudo dd bs=512 if=~/goldcard.img of=/dev/disk2. If you need to, replace the 2 again. Confirm, wait, enter your users password (or under linux your roots password) and confirm again.

Translation: figure out which disk you need, copy bits to disk. The dd command used here is available in every flavor of *nix I’ve ever encountered. Let me break that crucial fifth step down for you.

sudo I am the Administrator, do the following as I say! (normal users are not allowed to copy bits this way, that would be a Very Bad Idea)

dd Copy bits

bs=512 in chunks of 512 bytes (“bs” stands for “block size” in this case, not cow manure)

if=~/goldcard.img from this file (“if” for “input file” and “~/” just means the file is in my home folder)

of=/dev/disk2 to this disk (“of” for “output file” and “/dev” is the Device File folder, where *nix pretends that attached devices are just files, e.g. disk2 for my SD-Card).
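To make the breakdown above a bit more concrete, here is a rough Python sketch of what that command boils down to: read the input file one block at a time and write each block over the start of the output. It is an illustration, not how dd is actually implemented. The function name `copy_blocks` is made up, and an ordinary temporary file stands in for the SD card so nothing real gets overwritten.

```python
import os
import tempfile


def copy_blocks(src_path, dst_path, block_size=512):
    """Copy src over the beginning of dst in fixed-size blocks.

    Roughly what `dd bs=512 if=src of=dst` does when dst is a device.
    Returns the number of bytes copied.
    """
    copied = 0
    # "r+b" overwrites in place without truncating, like writing to a
    # device node; on a regular file dd would need conv=notrunc for that.
    with open(src_path, "rb") as src, open(dst_path, "r+b") as dst:
        while True:
            block = src.read(block_size)  # at most block_size bytes
            if not block:
                break
            dst.write(block)
            copied += len(block)
    return copied


# Demo with stand-in files; sizes and contents are arbitrary
with tempfile.TemporaryDirectory() as tmp:
    image = os.path.join(tmp, "goldcard.img")
    card = os.path.join(tmp, "fake_card.bin")
    with open(image, "wb") as f:
        f.write(b"\xab" * 600)  # pretend goldcard image
    with open(card, "wb") as f:
        f.write(b"\x00" * 2048)  # pretend SD card
    copied = copy_blocks(image, card)
    with open(card, "rb") as f:
        result = f.read()

print(copied, len(result))
```

The start of the fake card now holds the image and the rest is left untouched, which is all the fifteen-step Windows procedure was trying to achieve.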

It could be just me, but that looks a lot simpler (and faster and less error-prone) than the Windows instructions.

The problem seems to be that, although Windows is perceived to be easy to use (and it probably is, up to some point), the lack of real and raw power under-the-hood can make something as trivial as copying a few bits impossible without resorting to downloading dubious applications. It’s like having a car that is pretty and easy to drive, as long as you stay out of the fastlane.

And that, my friends, is one of the reasons why people like me do not like Windows very much. We nerds, we like living life in the fastlane.

Written by Lukas Vermeer

July 3, 2010 at 17:44

Posted in Android, Bash, Linux
