YARN (Yet Another Resource Negotiator) :
YARN was introduced to split the functions of resource management and job
scheduling/monitoring into separate processes. The idea is to have a global
ResourceManager (RM) and a per-application ApplicationMaster (AM). An
application can be a single job or a DAG of jobs. Existing MapReduce jobs run
unchanged on top of YARN after just a recompile.
Resource Manager :
The ResourceManager has two components:
1. Scheduler
2. ApplicationsManager
Applications Manager :
The ApplicationsManager accepts job submissions, negotiates the first container
for executing the application-specific ApplicationMaster, and provides the
service for restarting the ApplicationMaster container on failure. The
per-application ApplicationMaster negotiates the appropriate resource containers
from the Scheduler, tracks their status and monitors their progress.
Scheduler :
The Scheduler is responsible for allocating resources to the various running
applications, subject to the familiar constraints of capacities, queues, etc.
The Scheduler does not monitor or track the status of applications, and it
offers no guarantee of restarting tasks that fail because of application
failure or hardware failure. Scheduling is based on the resource requirements
of the applications, expressed through the abstract notion of a resource
container, which includes elements such as memory, CPU, disk, network, etc.
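The allocate-only role of the Scheduler can be sketched as a toy model. This is not YARN's scheduler code: the class names, the queue names and the idea of expressing each queue's capacity as a fraction of the cluster are assumptions made for illustration, and only memory and vcores are modelled.

```python
from dataclasses import dataclass


@dataclass
class Resource:
    """Hypothetical resource vector: only memory and vcores are modelled."""
    memory_mb: int
    vcores: int

    def fits_in(self, other):
        return self.memory_mb <= other.memory_mb and self.vcores <= other.vcores


class ToyScheduler:
    """Allocates containers within per-queue limits; it never monitors or restarts tasks."""

    def __init__(self, cluster, queue_share):
        # Assumption: each queue is capped at a fixed fraction of the cluster.
        self.limits = {
            name: Resource(int(cluster.memory_mb * share), int(cluster.vcores * share))
            for name, share in queue_share.items()
        }
        self.used = {name: Resource(0, 0) for name in queue_share}

    def allocate(self, queue, request):
        """Return True and record the usage if the request fits in the queue's free share."""
        limit, used = self.limits[queue], self.used[queue]
        free = Resource(limit.memory_mb - used.memory_mb, limit.vcores - used.vcores)
        if not request.fits_in(free):
            return False
        used.memory_mb += request.memory_mb
        used.vcores += request.vcores
        return True


scheduler = ToyScheduler(Resource(8192, 8), {"prod": 0.75, "dev": 0.25})
granted = scheduler.allocate("dev", Resource(1024, 1))  # fits in the dev share
```

A request is either granted or refused; what happens to the work running inside a granted container is left to the ApplicationMaster, which matches the division of labour described above.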
Node Manager :
The Node Manager is the slave daemon; there are many per cluster. Once started,
the Node Manager sends a heartbeat signal to the Resource Manager periodically.
Each Node Manager offers some resources to the cluster for the execution of
programs. Its resource capacity is the amount of memory and the number of
vcores. At run time, the Resource Scheduler decides how to use this capacity.
A Container is a fraction of the Node Manager's capacity and is used by the
client for running a program.
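Since a Container is a fraction of a node's memory and vcores, the number of equally sized containers a node can host is bounded by whichever resource runs out first. The sketch below shows that arithmetic; the class and function names are hypothetical, and the property names in the comments are the NodeManager settings that usually carry these two values.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NodeCapacity:
    memory_mb: int  # typically configured via yarn.nodemanager.resource.memory-mb
    vcores: int     # typically configured via yarn.nodemanager.resource.cpu-vcores


def max_containers(node, container_memory_mb, container_vcores):
    """How many identical containers fit on one node (illustrative helper)."""
    by_memory = node.memory_mb // container_memory_mb
    by_vcores = node.vcores // container_vcores
    # The scarcer resource decides the limit.
    return min(by_memory, by_vcores)


node = NodeCapacity(memory_mb=16384, vcores=8)
count = max_containers(node, container_memory_mb=4096, container_vcores=1)
```

With 16384 MB and 8 vcores, containers of 4096 MB and 1 vcore are limited by memory, while containers of 1024 MB and 1 vcore would be limited by vcores.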
Container :
A Container is an allocated resource in the cluster. A set of system
resources, such as CPU cores and RAM, is allocated to each Container. The
ResourceManager is the sole authority for allocating Containers to
applications.
Application Master :
The Application Master is responsible for the execution of a single
application. The Resource Scheduler (Resource Manager) provides the required
containers on which the specific programs (e.g., the main of a Java class) are
executed. The Application Master knows the application logic and is therefore
framework-specific. The MapReduce framework provides its own implementation of
an Application Master.
In YARN, there are three actors:
o The Job Submitter (the client)
o The Resource Manager (the master)
o The Node Manager (the slave)
YARN Execution Process :
The application startup process is the following:
o The client program submits the MapReduce application to the ResourceManager,
along with the information required to launch the application-specific
ApplicationMaster.
o The ResourceManager negotiates a container for the ApplicationMaster and
launches the ApplicationMaster.
o The ApplicationMaster boots and registers itself with the ResourceManager,
thereby allowing the original calling client to communicate directly with the
ApplicationMaster.
o The ApplicationMaster negotiates resources (resource containers) for the
client application.
o The ApplicationMaster provides the container launch specification to the
NodeManager, which launches a container for the application.
o During execution, the client polls the ApplicationMaster for application
status and progress.
o On completion, the ApplicationMaster deregisters from the ResourceManager,
shuts down and returns its containers to the resource pool.
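The order of the steps above can be walked through with a small simulation. It is only a toy: ToyResourceManager and run_application are hypothetical names, containers are reduced to a counter, and no real YARN client API is involved.

```python
class ToyResourceManager:
    """Hypothetical stand-in for the ResourceManager: it only counts free containers."""

    def __init__(self, free_containers):
        self.free_containers = free_containers
        self.registered_masters = set()

    def allocate(self, count):
        if count > self.free_containers:
            raise RuntimeError("not enough free containers")
        self.free_containers -= count

    def release(self, count):
        self.free_containers += count


def run_application(rm, app_id, task_containers):
    """Walk through the startup steps in order and return them as a log."""
    log = ["client submits application to RM"]
    rm.allocate(1)  # container for the ApplicationMaster itself
    log.append("RM launches ApplicationMaster")
    rm.registered_masters.add(app_id)
    log.append("AM registers with RM")
    rm.allocate(task_containers)
    log.append("AM negotiates %d containers" % task_containers)
    log.append("NodeManagers launch containers")
    log.append("client polls AM for status")
    rm.registered_masters.discard(app_id)
    rm.release(task_containers + 1)  # task containers plus the AM container
    log.append("AM deregisters and returns containers")
    return log


rm = ToyResourceManager(free_containers=10)
steps = run_application(rm, "app_1", task_containers=4)
```

After the run, the pool is back at its starting size, which mirrors the last step: every container, including the one that hosted the ApplicationMaster, goes back to the ResourceManager.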