Introduction To Hadoop
By Prof. Jeffery Owens
- Release Date: 2012-08-01
- Genre: Computers
Hadoop is the best-known technology for Big Data and is used by Yahoo, eBay, LinkedIn and Facebook. It was inspired by Google's publications on MapReduce, the Google File System (GFS) and Bigtable.
Because Hadoop runs on commodity hardware (typically Intel PCs running Linux, with one or two CPUs and a few terabytes of disk, without any RAID replication), it lets these companies store huge quantities of data (petabytes or more) at very low cost compared to SAN storage systems.
Table of Contents
Introduction
Common Hadoop Terms
Hype surrounding Hadoop
What is Hadoop
Basic Concept
Installation
Alternate method of Downloading and Installing Hadoop
Installing Hadoop on Mac
Fast Start
Bootstrapping
Browsing to the Services
Example program
Map Reduce
Overview
Programming Model
Map
Example
Types
More Examples
Map Reduce Execution
How Map and Reduce operations are actually carried out
Map
Combine
Reduce
HDFS
Common example operations
Listing files
How to run Hadoop MapReduce jobs without a cluster? With Cloudera VM.
Troubleshooting