Figure 3 is computed by dividing movies into bins based on their average ratings. For each bin, we compute the average number of ratings per movie. In other words, the chart shows how many ratings a movie receives as a function of its average rating. From this graph, we can infer that highly rated movies are rated more often than low-rated ones. At both extremes the number of ratings is very low, presumably because people simply ignore the worst movies and consider it unnecessary to rate the best ones, whose rank is already obvious.
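The binning procedure described above can be sketched as follows. This is only a minimal illustration, not the code used to produce the figures: the `(movie_id, user_id, rating)` tuple layout, the function name `mean_count_per_bin`, and the bin width of 0.5 are all assumptions made for the example.

```python
import math
from collections import defaultdict

def mean_count_per_bin(ratings, key_index=0, bin_width=0.5):
    """Average number of ratings per entity, binned by the entity's mean rating.

    ratings   -- list of (movie_id, user_id, rating) tuples (assumed layout)
    key_index -- 0 to group by movie, 1 to group by user
    bin_width -- width of each average-rating bin (assumed value)
    """
    totals = defaultdict(float)  # sum of ratings per entity
    counts = defaultdict(int)    # number of ratings per entity
    for record in ratings:
        key = record[key_index]
        totals[key] += record[2]
        counts[key] += 1

    # Assign each entity to a bin by its average rating and collect its count.
    bins = defaultdict(list)
    for key, n in counts.items():
        avg = totals[key] / n
        bin_start = math.floor(avg / bin_width) * bin_width
        bins[bin_start].append(n)

    # For each bin, average the rating counts of the entities it contains.
    return {b: sum(ns) / len(ns) for b, ns in sorted(bins.items())}

sample = [
    ("m1", "u1", 4), ("m1", "u2", 5), ("m1", "u3", 4),
    ("m2", "u1", 1),
    ("m3", "u2", 4), ("m3", "u3", 5),
]
per_movie = mean_count_per_bin(sample, key_index=0)
per_user = mean_count_per_bin(sample, key_index=1)
```

Grouping by movie (`key_index=0`) corresponds to the computation behind Figure 3; grouping by user (`key_index=1`) gives the analogous per-user curve.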
Figure 4 presents the number of ratings given by each user as a function of that user's average rating. It is computed by dividing users into bins based on their average ratings and then computing, for each bin, the average number of ratings per user. We can infer from this chart that people who rate few movies usually give negative ratings: they bother to rate only when a movie is very disappointing. On the other hand, people who rate many movies show more variety in their ratings, and their scores usually range from 2.5 to 5. They are true movie fans. This also means that the rating profiles of users with high average ratings give us more information and should be exploited by prediction methods.
Figure 5 presents the average standard deviation of the ratings of each movie as a function of that movie's average rating. From this graph, we learn that the ratings for the worst and the best movies are quite stable, that is, they vary little across users. This suggests the following idea for rating prediction: for movies with the lowest and highest average ratings, we can use the average rating as the predicted value, while for movies with mid-range average ratings we can apply more complex prediction methods.
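The prediction idea above can be sketched as a simple two-branch rule. This is a hedged illustration under stated assumptions: the thresholds `low=2.0` and `high=4.5`, the `(movie_id, user_id, rating)` layout, and the `complex_model` callable are hypothetical placeholders, since the text does not specify where the "extreme" range begins or which complex method is used.

```python
from collections import defaultdict
from statistics import mean, pstdev

def movie_stats(ratings):
    """Return {movie_id: (mean rating, population standard deviation)}."""
    by_movie = defaultdict(list)
    for movie_id, _user_id, rating in ratings:
        by_movie[movie_id].append(rating)
    return {m: (mean(rs), pstdev(rs)) for m, rs in by_movie.items()}

def predict(movie_id, user_id, stats, complex_model, low=2.0, high=4.5):
    """Use the movie average at the extremes, a complex model otherwise.

    low, high     -- assumed cut-offs for "worst" and "best" movies
    complex_model -- hypothetical callable (movie_id, user_id) -> rating
    """
    avg, _std = stats[movie_id]
    if avg <= low or avg >= high:
        return avg  # ratings are stable here, so the mean is a fair guess
    return complex_model(movie_id, user_id)

sample = [("a", "u1", 5), ("a", "u2", 5), ("b", "u1", 2), ("b", "u2", 4)]
stats = movie_stats(sample)
fallback = lambda movie_id, user_id: 3.3  # stand-in for a real model
pred_extreme = predict("a", "u3", stats, fallback)
pred_middle = predict("b", "u3", stats, fallback)
```

In practice the cut-offs would be chosen by looking at where the standard deviation curve flattens, rather than fixed in advance as they are here.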
Figure 6, which plots a 1% sample of the rating subset, illustrates the sparseness of the rating matrix. Only 1.71% of the entries in the matrix are rated, which is a real challenge for any missing-value prediction method.
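A density figure like the one quoted above is the number of observed ratings divided by the total number of cells in the user-by-movie matrix. The sketch below shows that calculation under assumptions: the `(movie_id, user_id, rating)` layout is hypothetical, and the matrix dimensions are taken from the users and movies that appear at least once in the data, which may differ from how the reported percentage was obtained.

```python
def matrix_density(ratings):
    """Fraction of filled cells in the user-by-movie rating matrix.

    ratings -- list of (movie_id, user_id, rating) tuples (assumed layout)
    """
    users = {user_id for _m, user_id, _r in ratings}
    movies = {movie_id for movie_id, _u, _r in ratings}
    # Count each (movie, user) pair once, even if it is listed repeatedly.
    observed = {(movie_id, user_id) for movie_id, user_id, _r in ratings}
    total_cells = len(users) * len(movies)
    if total_cells == 0:
        return 0.0
    return len(observed) / total_cells

sample = [("m1", "u1", 4), ("m1", "u2", 5), ("m2", "u1", 1)]
density = matrix_density(sample)  # 3 filled cells out of 2 x 2
```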