Spark for Beginners: Tutorial – Twitter Analysis with Spark SQL – Example


In this tutorial, we'll run a simple analysis of Tweets using Spark SQL on a JSON file. The exercise uses Java to retrieve a stream of Tweets and Scala for the Spark SQL scripts. The link to the GitHub repo is given at the end of the tutorial.

Architecture

The diagram above illustrates the architecture of our application.

 

Design Phase

 

Note: Phase 3 is not yet implemented. A separate tutorial will be dedicated to it soon.

Retrieving Tweets by Categories

  • food
  • foodporn
  • recipe
  • cooking
  • healthy
  • cook
  • recipes
  • yummy
  • instafood
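
As a rough illustration of how the hashtags listed above could be used to keep only the Tweets that belong to a category, here is a minimal sketch in plain Java. It is not the code from the repo: the class `TweetCategoryFilter` and its methods are hypothetical names, the Tweets are represented as plain strings, and the Twitter client that actually retrieves the stream is left out because the source does not show it.

```java
import java.util.List;
import java.util.Locale;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

// Hypothetical helper, not part of the tutorial's repo.
class TweetCategoryFilter {
    // Hashtags from the list above, without the leading '#'.
    static final Set<String> CATEGORY_TAGS = Set.of(
        "food", "foodporn", "recipe", "cooking", "healthy",
        "cook", "recipes", "yummy", "instafood");

    // Matches a '#' followed by letters, digits or underscores.
    private static final Pattern HASHTAG = Pattern.compile("#(\\w+)");

    // Returns true if the text contains at least one of the category hashtags.
    public static boolean matchesCategory(String text) {
        if (text == null) {
            return false;
        }
        Matcher m = HASHTAG.matcher(text);
        while (m.find()) {
            // Compare case-insensitively: "#Recipe" and "#recipe" are the same tag.
            if (CATEGORY_TAGS.contains(m.group(1).toLowerCase(Locale.ROOT))) {
                return true;
            }
        }
        return false;
    }

    // Keeps only the texts that carry a category hashtag.
    public static List<String> keepCategoryTweets(List<String> texts) {
        return texts.stream()
            .filter(TweetCategoryFilter::matchesCategory)
            .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> sample = List.of("Best #Recipe ever", "Traffic is awful", "So #yummy");
        System.out.println(keepCategoryTweets(sample)); // prints [Best #Recipe ever, So #yummy]
    }
}
```

Matching on whole hashtags rather than substrings means a Tweet tagged `#foodie` is not mistaken for `#food`. Whether the original application filters this way or passes the keywords directly to the Twitter API is not stated in the source.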

Goal

Find the ten most used languages in the Tweets:
SELECT lang, COUNT(*) AS c FROM EntertainmentTable WHERE lang IS NOT NULL GROUP BY lang ORDER BY c DESC LIMIT 10
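
To make explicit what the query above computes, the following plain-Java sketch applies the same steps (drop null languages, group by language, count, sort by count in descending order, keep the first rows) to an in-memory list of language codes. It does not use Spark, and `TopLanguages`, `topLanguages` and the sample data are hypothetical names and values chosen for illustration only.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.Objects;
import java.util.function.Function;
import java.util.stream.Collectors;

// Hypothetical in-memory equivalent of the SQL query; not the tutorial's Spark code.
class TopLanguages {
    // Mirrors: WHERE lang IS NOT NULL GROUP BY lang ORDER BY c DESC LIMIT n
    public static List<Map.Entry<String, Long>> topLanguages(List<String> langs, int n) {
        // WHERE lang IS NOT NULL, then GROUP BY lang with COUNT(*)
        Map<String, Long> counts = langs.stream()
            .filter(Objects::nonNull)
            .collect(Collectors.groupingBy(Function.identity(), Collectors.counting()));

        // ORDER BY c DESC LIMIT n (the order of ties is unspecified, as in SQL)
        return counts.entrySet().stream()
            .sorted(Map.Entry.<String, Long>comparingByValue().reversed())
            .limit(n)
            .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // Arrays.asList is used because it accepts null elements.
        List<String> langs = Arrays.asList("en", "fr", "en", null, "es", "en", "fr");
        topLanguages(langs, 10)
            .forEach(e -> System.out.println(e.getKey() + " " + e.getValue()));
        // prints:
        // en 3
        // fr 2
        // es 1
    }
}
```

In Spark itself, a query string like this is typically passed to the `sql` method of a `SparkSession` after the JSON data has been registered under the table name used in the query. That wiring is not shown in the source, so it is left out here.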

 

GitHub Repo