Page 113 - ASBIRES-2017_Preceedings
HADOOP BASED GRAPH ANALYTICS AND DATA ANALYTICS TOOLS ON MASSIVE OPEN
ONLINE COURSES
Table 2: Results by number of videos watched

Classifier              60% training   70% training   80% training
MultinomialNB           0.97           0.93           0.93
BernoulliNB             96.67          96.67          96.67
SGDClassifier           6.65           95.32          96
SVC_classifier          96.77          71.81          95.65
LinearSVC_classifier    96.42          93.67          95.58
MNB_classifier          96.65          93.25          92.89
Logistic regression     96.68          95.93          95.06

Furthermore, the number of days active and the number of chapters used were identified as the most influential attributes for the above models.

5 DISCUSSION

This research project was carried out to analyze MOOC data through graphs and data mining techniques. With the help of this tool, new users can estimate their probability of completing a course. In addition, data mining techniques were used to develop the best model and to identify the attributes that most affect the results. Moreover, the tool provides attractive and user-friendly interfaces for data manipulation.

There is no comparable tool dedicated to analyzing MOOC data. This system is more efficient and user-friendly than the general-purpose tools used for data mining, such as Weka. Those tools offer many data analysis techniques. However, their graph analysis is weak, and they cannot be customized. In contrast, this system can be customized according to the data set.

5.1 Limitation of the System

High memory consumption: A large memory (RAM) capacity is required to run Hadoop and Python. Hadoop runs on a virtual machine, which needs almost 10 GB of RAM to work properly.

6 CONCLUSION

A MOOC tool was developed to help analyze MOOC data. Through graph analysis, users can view and compare different kinds of attributes efficiently. Furthermore, the system can be used to analyze the data efficiently with Hadoop and Hue. A number of graph patterns were realized to identify student data based on region, subject, and course duration. Data mining techniques were used to develop a machine learning model to predict the pass rate. The logistic regression model gave the highest accuracy, 96.08%. Graph analytics and data analytics patterns could be utilized in the future for the effective functioning of MOOC systems.
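
As an illustration of the kind of evaluation summarized in Table 2, the following minimal sketch trains a model on 60%, 70%, and 80% of a data set and measures accuracy on the remainder. It is not the authors' implementation: the data are synthetic, the feature names (days active, chapters used) and the pass rule are hypothetical stand-ins, and a small hand-written logistic regression is used in place of a library classifier so that the example needs only the Python standard library.

```python
import math
import random


def make_synthetic_learners(n, seed=0):
    """Generate hypothetical learner rows: ([days_active, chapters_used], passed).

    Both features are scaled to [0, 1]; the pass rule is an assumption
    made only for this illustration.
    """
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        days_active = rng.random()
        chapters_used = rng.random()
        passed = 1 if days_active + chapters_used > 1.0 else 0
        rows.append(([days_active, chapters_used], passed))
    return rows


def sigmoid(z):
    """Logistic function, written to avoid overflow for large |z|."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def predict_proba(model, x):
    """Probability of the positive class under the fitted model."""
    weights, bias = model
    return sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)


def train_logistic(rows, epochs=500, lr=0.5):
    """Fit weights and bias by batch gradient descent on the log loss."""
    n_features = len(rows[0][0])
    weights = [0.0] * n_features
    bias = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * n_features
        grad_b = 0.0
        for x, y in rows:
            err = predict_proba((weights, bias), x) - y
            for j, xj in enumerate(x):
                grad_w[j] += err * xj
            grad_b += err
        weights = [w - lr * g / len(rows) for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / len(rows)
    return weights, bias


def accuracy(model, rows):
    """Fraction of rows whose thresholded prediction matches the label."""
    correct = sum(
        1 for x, y in rows if (predict_proba(model, x) >= 0.5) == (y == 1)
    )
    return correct / len(rows)


def evaluate_splits(rows, fractions=(0.6, 0.7, 0.8)):
    """Train on the first `frac` of the rows and test on the rest."""
    results = {}
    for frac in fractions:
        cut = int(len(rows) * frac)
        train, test = rows[:cut], rows[cut:]
        results[frac] = accuracy(train_logistic(train), test)
    return results


if __name__ == "__main__":
    data = make_synthetic_learners(300)
    for frac, acc in evaluate_splits(data).items():
        print(f"{int(frac * 100)}% training: accuracy {acc:.2%}")
```

Because the rows are generated in random order, taking the first part of the list as the training set acts as a random split. With real MOOC records the rows would normally be shuffled first, and the accuracies would differ from those obtained on this synthetic data.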