Apache Zeppelin [1] is a handy notebook tool for exploratory data analysis with Spark, SparkML and SparkSQL.
Sample Visualization in Apache Zeppelin generated using SparkSQL:
Create an RDD, convert it to a DataFrame and register it as a temp table:
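A minimal sketch of this step, assuming a Zeppelin `%pyspark` paragraph where the interpreter supplies `sc` and `sqlContext`. The sample rows, the column names (`name`, `age`, `city`), the table name `people`, and the helpers `parse_line` and `register_people` are all hypothetical and only serve as illustration.

```python
# Intended for a Zeppelin %pyspark paragraph, where `sc` and `sqlContext`
# are provided by the interpreter. The data and column names are made up.

SAMPLE_LINES = [
    "alice,34,Paris",
    "bob,45,Berlin",
    "carol,29,Paris",
]


def parse_line(line):
    """Turn one CSV line into a (name, age, city) tuple."""
    name, age, city = [field.strip() for field in line.split(",")]
    return (name, int(age), city)


def register_people(sc, sqlContext, lines, table_name="people"):
    """Build an RDD, convert it to a DataFrame, and register a temp table."""
    rdd = sc.parallelize(lines).map(parse_line)
    df = sqlContext.createDataFrame(rdd, ["name", "age", "city"])
    # Spark 1.x call; newer Spark versions offer createOrReplaceTempView.
    df.registerTempTable(table_name)
    return df
```

In a notebook paragraph this would be invoked as `register_people(sc, sqlContext, SAMPLE_LINES)`. `registerTempTable` matches the SQLContext-era API this post refers to; on Spark 2.x and later, `df.createOrReplaceTempView(table_name)` is the usual replacement.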
Query your data using SparkSQL:
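As a rough sketch of the query step, assuming a temp table named `people` with `city` and `age` columns has already been registered (both the table and the helper names below are hypothetical):

```python
def avg_age_by_city_query(table_name="people"):
    """Return a SQL string computing the average age per city."""
    return (
        "SELECT city, AVG(age) AS avg_age "
        "FROM {0} GROUP BY city ORDER BY city".format(table_name)
    )


def avg_age_by_city(sqlContext, table_name="people"):
    """Run the query through the given SQLContext and return a DataFrame."""
    return sqlContext.sql(avg_age_by_city_query(table_name))
```

In Zeppelin the same SQL text can also be typed directly into a `%sql` paragraph, which is typically how the built-in charts are produced; calling `avg_age_by_city(sqlContext)` from a `%pyspark` paragraph returns a DataFrame for further processing instead.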
Use SparkML for analytics:
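One possible illustration of this step, not necessarily the analysis the original notebook ran: clustering rows with k-means through the `pyspark.ml` pipeline API. It assumes a DataFrame with a numeric `age` column; the function name `cluster_by_age` and the output column `cluster` are hypothetical.

```python
def cluster_by_age(df, k=2, seed=1):
    """Assign each row of `df` to one of `k` clusters based on its age."""
    # Imported inside the function so that pyspark is only required
    # when the function is actually called.
    from pyspark.ml import Pipeline
    from pyspark.ml.clustering import KMeans
    from pyspark.ml.feature import VectorAssembler

    # pyspark.ml estimators expect a single vector column of features.
    assembler = VectorAssembler(inputCols=["age"], outputCol="features")
    kmeans = KMeans(k=k, seed=seed, featuresCol="features",
                    predictionCol="cluster")

    model = Pipeline(stages=[assembler, kmeans]).fit(df)
    return model.transform(df)
```

The returned DataFrame keeps the original columns and adds `features` and `cluster`, so it can be registered as another temp table and queried or charted like any other.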
Once the data is registered with the Spark SQLContext, Zeppelin can be used to visualize it.
[1] Apache Zeppelin: https://zeppelin.incubator.apache.org/