---
author: ISTD, SUTD
title: YARN
date: Oct 25, 2021
logo:
footnote:
header-includes: |
  \usepackage{tikz}
  \usetikzlibrary{positioning}
  \usetikzlibrary{arrows}
  \usetikzlibrary{shapes.multipart}
---
By the end of this lesson, you will be able to
- Articulate the purpose of Hadoop YARN
- Identify the shortcomings of JobTracker and TaskTracker in Hadoop v1
- Describe how YARN addresses the above issues
- List and differentiate the different YARN schedulers
We have seen
- HDFS
- Hadoop MapReduce
We are looking at Hadoop Resource Management
- YARN
| | Your local OS | Data center resource mgmt sys |
|---|---|---|
| Goal | Easy to use, isolation | Utilization, isolation |
| Scale | 100 | >1000s of jobs per day |
| Failure | Fail together, easy to trace | Fail independently, hard to trace |
| Users | Single user | Multi-tenant (concurrent) |
| Resources | Small set of resources | Large set of resources |
- In Hadoop v1, the master node runs both the NameNode and the JobTracker; the JobTracker is overloaded, handling:
- Resource management
- Scheduling
- Job monitoring
- Job lifecycle
- Fault tolerance
- It becomes the bottleneck
- Only scales to < 4K nodes, 40K tasks
- TaskTracker = worker node
- Fixed mapper slots
- Fixed reducer slots
- Preset at the beginning
- Non-fungible: an idle mapper slot cannot run a reducer task, and vice versa
Yet Another Resource Negotiator
- A general resource management system shipped with Hadoop v2+
- Designed with Hadoop in mind
- Addresses the problems of Hadoop v1 resource management
- Supports not just Hadoop MapReduce jobs (e.g., Spark also runs on YARN)
- User-defined sharing policies
- The Application Master runs on a worker node
- What if the worker node hosting the Application Master dies? The Resource Manager detects the failure and restarts the Application Master in a new container
- A container can run both mapper and reducer tasks (no fixed slots)
- Client submits an application
- Application Manager (part of the Resource Manager) allocates a container to start the Application Master
- Application Master registers with the Resource Manager
- Application Master requests containers from the Resource Manager
- Application Master notifies Node Managers to launch the containers
- Application code is executed in the containers
- Client contacts the Application Master to monitor application status
- Application Master unregisters with the Resource Manager
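A minimal client-side sketch of the flow above, using Hadoop's YarnClient API; the application name, AM command, and resource sizes are illustrative assumptions:

```java
import java.util.Collections;

import org.apache.hadoop.yarn.api.ApplicationConstants;
import org.apache.hadoop.yarn.api.records.*;
import org.apache.hadoop.yarn.client.api.YarnClient;
import org.apache.hadoop.yarn.client.api.YarnClientApplication;
import org.apache.hadoop.yarn.conf.YarnConfiguration;
import org.apache.hadoop.yarn.util.Records;

public class SubmitApp {
  public static void main(String[] args) throws Exception {
    YarnClient yarnClient = YarnClient.createYarnClient();
    yarnClient.init(new YarnConfiguration());
    yarnClient.start();

    // Client asks the Resource Manager for a new application
    YarnClientApplication app = yarnClient.createApplication();
    ApplicationSubmissionContext appContext = app.getApplicationSubmissionContext();
    appContext.setApplicationName("demo-app"); // assumed name

    // Describe the container that will host the Application Master;
    // my.pkg.MyApplicationMaster is a placeholder for your own AM class
    ContainerLaunchContext amContainer = Records.newRecord(ContainerLaunchContext.class);
    amContainer.setCommands(Collections.singletonList(
        "$JAVA_HOME/bin/java my.pkg.MyApplicationMaster"
            + " 1>" + ApplicationConstants.LOG_DIR_EXPANSION_VAR + "/stdout"
            + " 2>" + ApplicationConstants.LOG_DIR_EXPANSION_VAR + "/stderr"));
    appContext.setAMContainerSpec(amContainer);
    appContext.setResource(Resource.newInstance(1024, 1)); // 1 GB, 1 vcore for the AM

    // The Application Manager in the RM allocates a container and starts the AM
    ApplicationId appId = yarnClient.submitApplication(appContext);

    // Client monitors application status via the RM
    ApplicationReport report = yarnClient.getApplicationReport(appId);
    System.out.println("State: " + report.getYarnApplicationState());
  }
}
```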
- The Application Master asks for resources either upfront or during execution
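Inside the Application Master, both styles go through the AMRMClient API; a sketch, assuming a synchronous client and illustrative container sizes:

```java
import org.apache.hadoop.yarn.api.records.FinalApplicationStatus;
import org.apache.hadoop.yarn.api.records.Priority;
import org.apache.hadoop.yarn.api.records.Resource;
import org.apache.hadoop.yarn.client.api.AMRMClient;
import org.apache.hadoop.yarn.client.api.AMRMClient.ContainerRequest;
import org.apache.hadoop.yarn.conf.YarnConfiguration;

public class AmSketch {
  public static void main(String[] args) throws Exception {
    AMRMClient<ContainerRequest> rmClient = AMRMClient.createAMRMClient();
    rmClient.init(new YarnConfiguration());
    rmClient.start();

    // AM registers with the Resource Manager (no tracking URL in this sketch)
    rmClient.registerApplicationMaster("", 0, "");

    // Upfront: ask for four 2 GB / 1 vcore containers before doing any work
    Resource capability = Resource.newInstance(2048, 1);
    Priority priority = Priority.newInstance(0);
    for (int i = 0; i < 4; i++) {
      rmClient.addContainerRequest(new ContainerRequest(capability, null, null, priority));
    }

    // During execution: each allocate() heartbeat can carry new requests
    // as the application discovers it needs more containers
    rmClient.allocate(0.0f);

    // AM unregisters when the application finishes
    rmClient.unregisterApplicationMaster(FinalApplicationStatus.SUCCEEDED, "", "");
  }
}
```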
- Scheduling is a tough problem - a multi-objective optimization problem:
- Capacity guarantee
- SLA
- Fairness
- Cluster Utilization
- ...
- FIFOScheduler
- CapacityScheduler
- FairScheduler
- But users can define their own scheduler
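The active scheduler is selected in yarn-site.xml via the yarn.resourcemanager.scheduler.class property; for example, picking the CapacityScheduler (the other built-in classes end in ...fifo.FifoScheduler and ...fair.FairScheduler):

```xml
<!-- yarn-site.xml -->
<property>
  <name>yarn.resourcemanager.scheduler.class</name>
  <value>org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler</value>
</property>
```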
- Single queue
- One job takes up the entire cluster
- First come first serve
- Not good for short jobs queuing behind long jobs
- Statically divided into multiple queues, e.g.
- one for short jobs
- one for long jobs
- A request is submitted to one queue
- Within the queue, FIFO
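A sketch of such a static division in capacity-scheduler.xml; the queue names and percentages are illustrative:

```xml
<!-- capacity-scheduler.xml -->
<property>
  <name>yarn.scheduler.capacity.root.queues</name>
  <value>short,long</value>
</property>
<property>
  <!-- 30% of the cluster is guaranteed to the short-job queue -->
  <name>yarn.scheduler.capacity.root.short.capacity</name>
  <value>30</value>
</property>
<property>
  <name>yarn.scheduler.capacity.root.long.capacity</name>
  <value>70</value>
</property>
```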
- Non-static queue
- Policies
- Simple - All jobs have equal share
- Advanced - Jobs in queue A have X% min, jobs in queue B have Y% min.
- When there is no other job running, a job can use up all resources
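A sketch of the advanced policy in the FairScheduler's allocation file; queue names, weights, and minimums are illustrative:

```xml
<!-- fair-scheduler.xml (allocation file) -->
<allocations>
  <queue name="queueA">
    <!-- the X% minimum, expressed as concrete resources -->
    <minResources>2048 mb,2 vcores</minResources>
    <weight>2.0</weight>
  </queue>
  <queue name="queueB">
    <minResources>1024 mb,1 vcores</minResources>
    <weight>1.0</weight>
  </queue>
</allocations>
```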
- How to enforce fair share?
- Preemption - kill containers of jobs running over their fair share
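Preemption is off by default in the FairScheduler and is enabled in yarn-site.xml (the timeouts before the scheduler starts killing containers are set in the allocation file):

```xml
<!-- yarn-site.xml -->
<property>
  <name>yarn.scheduler.fair.preemption</name>
  <value>true</value>
</property>
```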
- How to define a metric for multi-resource requests/shares?
- Total: 9 CPUs, 18 GB RAM
- App A: each task needs 1 CPU, 4 GB RAM
- App B: each task needs 3 CPUs, 1 GB RAM
- Dominant Resource Fairness (DRF)
- Dominant Resource Fairness: Fair Allocation of Multiple Resource Types. Ali Ghodsi, Matei Zaharia, Benjamin Hindman, Andy Konwinski, Scott Shenker, Ion Stoica. NSDI 2011.
https://cs.stanford.edu/~matei/papers/2011/nsdi_drf.pdf
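A worked sketch of the allocation for the numbers above (this is the running example from the DRF paper). App A's dominant resource is RAM ($4/18 = 2/9$ of the cluster per task); App B's is CPU ($3/9 = 1/3$ per task). With $x$ tasks for A and $y$ tasks for B, DRF maximizes the allocations subject to equal dominant shares:

$$\frac{4x}{18} = \frac{3y}{9}, \qquad x + 3y \le 9 \;\text{(CPU)}, \qquad 4x + y \le 18 \;\text{(RAM)}$$

which gives $x = 3$, $y = 2$: A holds 3 CPUs and 12 GB, B holds 6 CPUs and 2 GB, and both dominant shares equal $2/3$.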
- Mesos
- Static partitioning of resources
- Mesos Master and Mesos Agents
- Multi-framework support through Mesos Agents
- Each framework's scheduler obtains resources from the Mesos Master (via resource offers) and schedules its own jobs
- Google's Borg
- Large-scale cluster management at Google with Borg. Abhishek Verma, Luis Pedrosa, Madhukar R. Korupolu, David Oppenheimer, Eric Tune, John Wilkes. EuroSys 2015.
https://research.google/pubs/pub43438.pdf
- There is always a new kid on the block
https://tinyurl.com/y3rabllb
- The resource management layer is like your OS for the data center
- High utilization is its most important goal
- YARN fixes the issues with Hadoop v1 resource management
- YARN offers different scheduling policies