Unsupervised Habit Learning and Destination Prediction Algorithm



I developed a user habit learning and activity prediction algorithm using clustering and a Bayesian network for my Master's thesis at WPI in 2015, advised by Professor Taskin Padir. The algorithm predicts the destination the user wants to visit based on the user's past activity, such as the time when the user entered the building, the time when the user visited the office, and the time of day.

This algorithm can assist people like Stephen Hawking, who can communicate with machines only over a very low-bandwidth channel. When we know the probability of each option, information theory lets us find an optimal coding scheme for selecting among them. In my experiment, the destinations were ranked by the algorithm's output. Selecting a higher-ranked destination requires less input from the user than selecting a lower-ranked one.
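To make the coding idea concrete, here is a minimal sketch of Huffman coding, one standard way to build a shortest-on-average binary code from known probabilities. The post does not say which coding scheme the experiment used, so this is an illustration rather than the thesis implementation; the destination names and probabilities are hypothetical.

```python
import heapq
from itertools import count


def huffman_code(probabilities):
    """Return {option: bitstring} for a dict mapping option -> probability."""
    if len(probabilities) == 1:
        return {next(iter(probabilities)): "0"}
    tiebreak = count()  # unique counter so the heap never compares the dicts
    heap = [(p, next(tiebreak), {name: ""}) for name, p in probabilities.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        # Merge the two least likely groups, prefixing their codes with 0 and 1.
        p0, _, codes0 = heapq.heappop(heap)
        p1, _, codes1 = heapq.heappop(heap)
        merged = {name: "0" + code for name, code in codes0.items()}
        merged.update({name: "1" + code for name, code in codes1.items()})
        heapq.heappush(heap, (p0 + p1, next(tiebreak), merged))
    return heap[0][2]


# Hypothetical predicted probabilities for four destinations.
predicted = {"office": 0.5, "restroom": 0.25, "entrance": 0.125, "lab": 0.125}
codes = huffman_code(predicted)

# Average number of binary inputs the user needs per selection.
expected_length = sum(predicted[d] * len(codes[d]) for d in predicted)
```

With these numbers the most likely destination gets a one-symbol code and the two least likely get three-symbol codes, so likely destinations cost the user fewer inputs.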

The algorithm ran on the wheelchair platform shown below, built by Dmitry Sinyukov, who was a PhD student at WPI.

A simultaneous localization and mapping (SLAM) algorithm from the Robot Operating System (ROS) was used to calculate the real-time location of the wheelchair from LIDAR readings. I then used the density-based spatial clustering of applications with noise (DBSCAN) algorithm to find the locations the user visited frequently in the location history. Given these locations, a Bayesian network was trained on the time series of when the user visited each one. The user's future location can then be predicted by the Bayesian network from the locations the user has visited in the past.

Algorithm Flow Chart

Find Important Locations from the User's Location History

Clustering for Locations

The purpose of this step is to find the locations the user usually visits. The SLAM algorithm generated a map of the whole floor from the LIDAR data and, at the same time, calculated the wheelchair's location. I recorded the wheelchair's location every minute.

To find the locations the user usually visited, I tried several unsupervised clustering algorithms on the user's location history: K-means clustering, hierarchical clustering, Gaussian mixture model (GMM) and density-based spatial clustering of applications with noise (DBSCAN). The following images show the results of the different clustering algorithms. Points with different colors belong to different clusters.

In the final application, I used a combination of DBSCAN and GMM as the clustering algorithm. DBSCAN groups location points that are closely packed together (points with many nearby neighbors) and marks as outliers the points that lie alone in low-density regions (whose nearest neighbors are too far away). I then fitted Gaussian functions to the density distribution of each cluster, as shown in the following image.
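The DBSCAN step described above can be sketched in a few lines of plain Python. This is a simplified illustration, not the code that ran on the wheelchair: the `eps` and `min_pts` values and the sample coordinates are hypothetical, and the brute-force neighbor search is only suitable for small data sets.

```python
import math


def dbscan(points, eps, min_pts):
    """Label each point with a cluster id (0, 1, ...) or -1 for noise."""
    UNVISITED, NOISE = None, -1
    labels = [UNVISITED] * len(points)

    def neighbors(i):
        # All points within eps of point i, including i itself.
        return [j for j, q in enumerate(points) if math.dist(points[i], q) <= eps]

    cluster = -1
    for i in range(len(points)):
        if labels[i] is not UNVISITED:
            continue
        seeds = neighbors(i)
        if len(seeds) < min_pts:
            labels[i] = NOISE  # may later be claimed as a border point
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # border point: reachable but not dense
            if labels[j] is not UNVISITED:
                continue
            labels[j] = cluster
            j_neighbors = neighbors(j)
            if len(j_neighbors) >= min_pts:
                queue.extend(j_neighbors)  # core point: keep expanding
    return labels


# Hypothetical map coordinates: two frequently visited spots and one stray sample.
history = [
    (0.0, 0.0), (0.5, 0.0), (0.0, 0.5), (0.5, 0.5),
    (10.0, 10.0), (10.5, 10.0), (10.0, 10.5), (10.5, 10.5),
    (5.0, 5.0),
]
labels = dbscan(history, eps=1.0, min_pts=3)
```

Here the two dense groups become clusters 0 and 1, and the isolated sample at (5.0, 5.0) is labeled -1, which matches the outlier behavior described in the text.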

Algorithm Result

The first image above shows the places the user usually visits. The second image shows the result of the density-based clustering algorithm, where different colors mean different clusters. The ellipses in the second image show the shapes of the Gaussian functions fitted to the density distribution of the location points in each cluster.
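As a rough illustration of how such an ellipse can be obtained, the sketch below fits a single 2-D Gaussian to one cluster and derives the ellipse axes from the eigenvalues of the covariance matrix. It assumes the simplest case of one Gaussian per cluster, which may differ from the fit used in the thesis; the function names and the 2-sigma scale are choices made for this example.

```python
import math


def fit_gaussian(points):
    """Return the mean and 2x2 maximum-likelihood covariance of 2-D points."""
    n = len(points)
    mx = sum(x for x, _ in points) / n
    my = sum(y for _, y in points) / n
    sxx = sum((x - mx) ** 2 for x, _ in points) / n
    syy = sum((y - my) ** 2 for _, y in points) / n
    sxy = sum((x - mx) * (y - my) for x, y in points) / n
    return (mx, my), ((sxx, sxy), (sxy, syy))


def ellipse_axes(cov, n_std=2.0):
    """Return (semi_major, semi_minor, angle_rad) of the n_std-sigma ellipse."""
    (sxx, sxy), (_, syy) = cov
    # Closed-form eigenvalues of a symmetric 2x2 matrix.
    half_trace = (sxx + syy) / 2.0
    spread = math.hypot((sxx - syy) / 2.0, sxy)
    lam_big = half_trace + spread
    lam_small = max(half_trace - spread, 0.0)  # guard against rounding below zero
    # Orientation of the major axis, measured from the x axis.
    angle = 0.5 * math.atan2(2.0 * sxy, sxx - syy)
    return n_std * math.sqrt(lam_big), n_std * math.sqrt(lam_small), angle


# Hypothetical cluster that is stretched along the x axis.
cluster_points = [(-2.0, 0.0), (2.0, 0.0), (0.0, -1.0), (0.0, 1.0)]
mean, cov = fit_gaussian(cluster_points)
semi_major, semi_minor, angle = ellipse_axes(cov)
```

For this cluster the major axis lies along x, so the ellipse is wider than it is tall, as one would expect for points spread along a corridor.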

Find the Relationship between Locations

Bayesian Network

After finding all the important locations, the Bayesian network can be trained on the times when the user visits and leaves each important location. The learned Bayesian network looks like the image below. T1, T2, T3 and T4 are how long ago the user visited locations 1, 2, 3 and 4. ToD is the time of day. The Bayesian network learned from the historical data can use this information to predict the probability of the user visiting each location.
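The post does not give the structure of the learned network, so the sketch below swaps in a naive Bayes model, the simplest Bayesian network, in which the destination is the only parent of every feature. It is meant only to show how time-of-day and time-since-visit features can yield destination probabilities. The class name, the feature names, the smoothing constant and the training records are all hypothetical; the 30-minute bin width follows the units used in the plots.

```python
from collections import Counter, defaultdict

BIN_MINUTES = 30  # bin width, matching the 30-minute units of the plots


def to_bin(minutes):
    """Convert a duration or time of day in minutes to a 30-minute bin index."""
    return int(minutes // BIN_MINUTES)


class NaiveBayesDestination:
    """Naive Bayes over binned features (a stand-in for the learned network)."""

    def __init__(self, n_bins=48, alpha=1.0):
        self.n_bins = n_bins  # assumed number of possible bins per feature
        self.alpha = alpha    # Laplace smoothing constant
        self.class_counts = Counter()
        self.feature_counts = defaultdict(Counter)  # (dest, feature) -> bin counts

    def fit(self, records):
        """records: iterable of (features, destination); features maps name -> bin."""
        for features, dest in records:
            self.class_counts[dest] += 1
            for name, value in features.items():
                self.feature_counts[(dest, name)][value] += 1
        return self

    def predict_proba(self, features):
        """Return {destination: probability} for one set of observed features."""
        total = sum(self.class_counts.values())
        scores = {}
        for dest, n_dest in self.class_counts.items():
            score = n_dest / total  # prior P(dest)
            for name, value in features.items():
                seen = self.feature_counts[(dest, name)][value]
                # Smoothed likelihood P(feature = value | dest).
                score *= (seen + self.alpha) / (n_dest + self.alpha * self.n_bins)
            scores[dest] = score
        norm = sum(scores.values())
        return {dest: s / norm for dest, s in scores.items()}


# Hypothetical history: ToD is the time-of-day bin, T_entrance is the number of
# bins since the user passed the entrance.
records = [
    ({"ToD": to_bin(9 * 60), "T_entrance": 0}, "office"),
    ({"ToD": to_bin(9 * 60), "T_entrance": 0}, "office"),
    ({"ToD": to_bin(9 * 60 + 30), "T_entrance": 0}, "office"),
    ({"ToD": to_bin(12 * 60), "T_entrance": 6}, "restroom"),
]
model = NaiveBayesDestination().fit(records)
probs = model.predict_proba({"ToD": to_bin(9 * 60), "T_entrance": 0})
ranking = sorted(probs, key=probs.get, reverse=True)
```

Sorting the destinations by predicted probability gives the kind of ranking used to decide which destination the user can select with the least effort.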

Block Diagram of How to Train the Bayesian Network

Fun Results

The time unit in the following images is 30 minutes.

The image above shows the relationship between the time of day, how long ago the user came in through the entrance, and the probability of going to the office.

The image above shows the relationship between the time of day, how long ago the user entered the office, and the probability of going to the restroom. One funny finding is that, most of the time, the user goes to the restroom from the office.