Sunday, January 20, 2013

Free Statistical Machine Learning Books and Lecture Slides

Statistical Machine Learning - Data Mining - Business Intelligence Books.

Inspired by David Smith's recent blog post about downloading a free statistical machine learning e-book, I am trying to "re-echo" and "re-package" the exciting news.  Thanks to the generous researchers who make their valuable work freely available, we can download free digital copies of their books (either whole or in part) and the accompanying lecture slides.  Below, I list the URLs for downloading some data mining / predictive analytics / statistical machine learning resources:

1.  "The Elements of Statistical Learning (Full Book)" by Trevor Hastie, Robert Tibshirani, and Jerome Friedman.

2.  "Information Theory, Inference, and Learning Algorithms (Full Book)" by David MacKay.

3.  "Forecasting: Principles and Practice (Full Book)" by Rob J Hyndman and George Athanasopoulos.

4.  "Introduction to Data Mining (3 Chapters of the Full Book and Complete Lecture Slides)" by Pang-Ning Tan, Michael Steinbach, and Vipin Kumar.

5.  "Machine Learning (Lecture Slides)" by Tom Mitchell.

6.  "Pattern Recognition and Machine Learning (one Sample Chapter and Some Lecture Slides)" by Christopher Bishop.

7.  "Data Mining for Business Intelligence (Instructor Materials)" by Galit Shmueli, Nitin Patel, and Peter Bruce.

(The following URLs do not really fit the data analytics topic, but they're useful to share anyway :D)

8.  "Multiagent Systems: Algorithmic, Game-Theoretic, and Logical Foundations (Full Book)" by Yoav Shoham and Kevin Leyton-Brown.

9.  "Fundamentals of Multiagent Systems with NetLogo Examples" by Jose M Vidal.

10.  "Essentials of Metaheuristics (Full Book)" by Sean Luke.


Hopefully the URLs will still work in the future.  Enjoy! :)

Saturday, January 19, 2013

Knowing Extra Data Analytic Tools can be Handy

Lately, I have been using R for preparing and analyzing large data sets.  It has been a reliable tool for many of my needs.  The numerous blogs (e.g. r-bloggers) that share useful tips make troubleshooting reasonably fast.

Memory Allocation Notification - R

R is not without flaws, though: I just realized that it has its limits.  In my setup, it cannot allocate more than about 2.2 GB of working memory, regardless of how powerful the computer is (see the notification in the picture above).  Even with 8 GB of RAM, R raises an error once the memory in use reaches the allowed limit.  (As far as I know, a cap of this kind is typical of 32-bit builds of R; 64-bit builds are not bound by it.)  For coping with "big data" analysis in R, Ryan Rosario's video is a good reference.

EM Clustering - Weka
I haven't really experimented with what the video suggests (I am not in the mood for learning a new package :p).  Instead, for my problem at the time (EM clustering of ~25,000 records), I just used my old "friend" Weka and solved the problem within 30 minutes or so.

Because a tool's limitations can surface at unexpected times, it is always handy to know more than one statistical learning tool (e.g. R, Weka, RapidMiner, etc.).  Happy data crunching then :).
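
For readers who have not met EM clustering before, here is a toy sketch of the idea in Python.  It is not what Weka does internally, just a minimal illustration under my own assumptions: one-dimensional data, exactly two Gaussian clusters, a fixed number of iterations, and made-up sample data.  The function name `em_two_gaussians` is hypothetical.

```python
import math
import random


def gaussian_pdf(x, mean, var):
    """Density of a 1-D normal distribution at x."""
    return math.exp(-((x - mean) ** 2) / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)


def em_two_gaussians(data, iterations=50):
    """Fit a two-component 1-D Gaussian mixture with plain EM.

    Returns (weights, means, variances), each a list of length 2.
    """
    # Crude initialisation (an assumption): start the means at the data extremes.
    means = [min(data), max(data)]
    overall_mean = sum(data) / len(data)
    overall_var = sum((x - overall_mean) ** 2 for x in data) / len(data)
    variances = [overall_var, overall_var]
    weights = [0.5, 0.5]

    for _ in range(iterations):
        # E-step: how responsible is each cluster for each point?
        resp = []
        for x in data:
            p = [weights[k] * gaussian_pdf(x, means[k], variances[k]) for k in range(2)]
            total = p[0] + p[1]
            resp.append([p[0] / total, p[1] / total])

        # M-step: re-estimate the parameters from the soft assignments.
        for k in range(2):
            nk = sum(r[k] for r in resp)
            weights[k] = nk / len(data)
            means[k] = sum(r[k] * x for r, x in zip(resp, data)) / nk
            var = sum(r[k] * (x - means[k]) ** 2 for r, x in zip(resp, data)) / nk
            variances[k] = max(var, 1e-6)  # guard against a collapsing cluster

    return weights, means, variances


# Made-up data: two well-separated groups around 0 and 8.
random.seed(0)
sample = [random.gauss(0.0, 1.0) for _ in range(300)]
sample += [random.gauss(8.0, 1.0) for _ in range(300)]

weights, means, variances = em_two_gaussians(sample)
print("weights:", weights)
print("means:", means)
print("variances:", variances)
```

On data like this the two estimated means should land close to the two group centres.  A real tool also has to deal with many dimensions, choosing the number of clusters, and a proper stopping rule, which is exactly why I was happy to let Weka do the work.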

Network Plot with Gephi - My 1st Trial

Quite some time after my initial blog post about Gephi, I finally took some time to learn to use it.  Actually, for a few days I had been trying to find network plot packages that work within the R environment.  However, I couldn't find one that produces more appealing visualizations than Gephi (at least for now).  For those of you who'd love to learn to make network plots in R, I recommend Dai Shizuka's blog for clear starter tutorials.  For those who want to learn Gephi, it will take about one or two days to make decent graphs.  Gephi is quite easy for a beginner to learn.  In principle, I only needed to read two guides (the quick start guide and the guide to supported input file formats) to get started.  For my trial plot (below), I spent most of the time preparing the edge list input file.  So take some time to try it yourself... :)

My First Gephi Plot.
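
Since preparing the edge list took most of my time, here is a small Python sketch of one way to generate such a file.  It assumes a CSV edge table with `Source`, `Target`, and `Weight` columns, which is the layout I understand Gephi's spreadsheet import to expect; the node names and the helper `write_edge_list` are made up for illustration, and this is not the file I actually used.

```python
import csv
import io


def write_edge_list(edges, handle):
    """Write (source, target, weight) tuples as a CSV edge table."""
    writer = csv.writer(handle, lineterminator="\n")
    # Header row: column names assumed to match what Gephi looks for.
    writer.writerow(["Source", "Target", "Weight"])
    for source, target, weight in edges:
        writer.writerow([source, target, weight])


# Hypothetical example network.
edges = [("alice", "bob", 1), ("bob", "carol", 2), ("alice", "carol", 1)]

# Write to an in-memory buffer so the result can be inspected directly.
buffer = io.StringIO()
write_edge_list(edges, buffer)
edge_csv = buffer.getvalue()
print(edge_csv)
```

To produce a real file instead, pass a handle from `open("edges.csv", "w", newline="")` to the same function and then import the file into Gephi.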

Saturday, January 5, 2013

Schiphol Airport's Luggage Logistics

While clearing out some old magazines and papers, I realized that some interesting information was hiding in one of the magazines.  I tore out some of the pages and scanned them to "conserve" the insight into how Schiphol Airport (one of the world's busiest airports) handles an important issue for any traveler: luggage (can you imagine arriving on another continent and not finding your luggage at the baggage belt? Surely it would be disastrous!!!).  Investing a minute or two to understand how they handle our precious bags won't hurt.  Anyway, it's surely better to keep the insight in this blog than to throw it straight into the trash can.


The source is Schiphol Magazine, September 2011.  I think I intentionally took the magazine with me on my travels so that I could scan the luggage operation description.  After more than a year, I finally scanned the pages today..  better late than never, huh.. :p

Click on the picture for better viewing and enjoy!!!

Luggage Logistics Operation - Page 1
Luggage Logistics Operation - Page 2
Luggage Logistics Operation - Page 3
Luggage Logistics Operation - Page 4




Tuesday, January 1, 2013

2012 Turns into 2013: Photo Clippings from a Dutch Calendar


2012 is on its way out and 2013 is inevitably approaching..  may we, in every new period of time, whether a new second, minute, hour, day, week, month, or year, always keep learning and be in a better state of goodness than in times past.. (amen :p)

2012 Calendar

I had fun scanning the 2012 calendar that our landlord gave us for free; it would be a pity if its nice pictures ended up in the trash.  They are also fun for decorating the blog at the start of this year :D

The Netherlands