Spark for Beginners: Project – Part 1

Hello everyone, I’m pleased to start a new series dedicated to a Big Data project.

Project’s context

Big data is an evolving term that describes any voluminous amount of structured, semi-structured and unstructured data that has the potential to be mined for information.
Big data is often characterized by the 3Vs:

  • The extreme volume of data.
  • The wide variety of data types.
  • The velocity at which the data must be processed.

Continue reading Spark for Beginners: Project – Part 1

Spark for Beginners: Tutorials – Spark Twitter analysis with spark SQL – example

In this tutorial, we’ll do a simple sentiment analysis of tweets with Spark SQL on a JSON file. This exercise uses Java to retrieve a stream of tweets and Scala for the Spark SQL scripts. You will find the GitHub repo link in the tutorial.

Architecture

The diagram above illustrates the architecture of our application.

Continue reading Spark for Beginners: Tutorials – Spark Twitter analysis with spark SQL – example

Spark for Beginners: Tutorials – Apache Spark Streaming Twitter java example

In this chapter, we will walk you through using Spark Streaming to process live tweet streams. Remember, Spark Streaming is a component of Spark that provides highly scalable, fault-tolerant stream processing. These exercises are designed as standalone Java programs that receive and process Twitter’s real sample tweet streams. You will find the GitHub Gist links in the tutorial.

Create a Twitter developer account

This video demonstrates how to create a Twitter application. First, go to https://apps.twitter.com/.



Continue reading Spark for Beginners: Tutorials – Apache Spark Streaming Twitter java example

Spark for Beginners: Tutorials – Connecting To Cassandra

Welcome! In this tutorial we will discover how to connect Spark to a Cassandra database using the Java language. The code is written in Java; you will find the GitHub Gist links in the tutorial.

Video Demo

Continue reading Spark for Beginners: Tutorials – Connecting To Cassandra

Spark for Beginners: Create Restful API with Java and MongoDB

Welcome! In this tutorial we will discover how to create a RESTful API with MongoDB as the NoSQL database, using the Java language. At the end of this tutorial you will be able to create your own API interacting with a NoSQL database (MongoDB). The code is written in Java; you will find the GitHub repo links at the end of the tutorial.

Demo on YouTube

Continue reading Spark for Beginners: Create Restful API with Java and MongoDB

Spark for beginners: Installation on Windows 10

Welcome! In this tutorial we will discover the Spark environment, install it under Windows 10, and run some tests with Apache Spark to see what this framework can do and learn to use it. The code for this lab will be written in Java and Scala; for what we will do, Scala is much lighter than Java. Do not worry if you do not know the language: we will use only very simple features of Scala, and basic knowledge of functional languages is all you need. If that’s not enough, Google is your friend.

Demo on YouTube

Continue reading Spark for beginners: Installation on Windows 10

Spark for beginners: Introduction

What is Spark?

Apache Spark is an open-source framework for Big Data processing, built to perform sophisticated analysis and designed for speed and ease of use. It was originally developed in 2009 at UC Berkeley’s AMPLab and open-sourced in 2010, later becoming an Apache project.

Continue reading Spark for beginners: Introduction