Hadoop Fair Sojourn Protocol

Description:

The Hadoop Fair Sojourn Protocol Scheduler (HFSP, available on GitHub) is a size-based scheduler for Hadoop that exploits approximate job size estimation to obtain response times that outperform the current state of the art (e.g., processor sharing).

Size-based scheduling with aging has long been recognized as an effective way to achieve fairness and near-optimal system response times. HFSP brings this technique to Hadoop, a real, complex, widely used multi-server system. Size-based scheduling requires a priori job size information, which Hadoop does not provide: HFSP builds this knowledge by estimating job sizes online during execution, and it is largely tolerant of estimation errors.
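
To see why serving jobs by size can beat processor sharing, consider a toy case: a single server and a batch of jobs that all arrive at time 0. The sketch below is an illustration under those assumptions only, not the HFSP implementation; it leaves out aging, later arrivals and multiple servers, and the function names are hypothetical.

```python
def ps_completion_times(sizes):
    """Completion times under processor sharing (all jobs arrive at t=0, one server)."""
    order = sorted(range(len(sizes)), key=lambda i: sizes[i])
    done = [0.0] * len(sizes)
    now = 0.0        # current time
    served = 0.0     # work each still-active job has received so far
    active = len(sizes)
    for i in order:
        # Every active job progresses at rate 1/active until job i finishes.
        now += (sizes[i] - served) * active
        served = sizes[i]
        done[i] = now
        active -= 1
    return done


def size_based_completion_times(sizes):
    """Completion times when jobs run one at a time, smallest first."""
    order = sorted(range(len(sizes)), key=lambda i: sizes[i])
    done = [0.0] * len(sizes)
    now = 0.0
    for i in order:
        now += sizes[i]
        done[i] = now
    return done


def mean(values):
    return sum(values) / len(values)


sizes = [1, 2, 3]
ps = ps_completion_times(sizes)            # [3.0, 5.0, 6.0]
sb = size_based_completion_times(sizes)    # [1.0, 3.0, 6.0]
print(mean(ps), mean(sb))
```

In this batch no job finishes later under size-based ordering than under processor sharing, and the mean response time drops from 14/3 to 10/3.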

HFSP copes with estimation errors by exploiting the job progress information that the Hadoop framework provides: it starts with a first rough estimate of job size based simply on the number of map/reduce tasks in the job, then refines this estimate once a few of the job's tasks have completed.
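
The two-stage estimate described above can be sketched as follows. This is a minimal illustration, not HFSP's actual estimator: the function name, the `default_task_duration` and `min_samples` parameters, and the use of a plain average over completed tasks are all assumptions made for the example.

```python
from statistics import mean


def estimate_job_size(num_tasks, completed_durations,
                      default_task_duration=1.0, min_samples=3):
    """Return an estimated job size, in task-seconds.

    Hypothetical sketch: before `min_samples` tasks have finished, the
    estimate depends only on the number of tasks; afterwards it is scaled
    by the average duration observed so far.
    """
    if len(completed_durations) < min_samples:
        # Rough initial estimate: task count times an assumed per-task cost.
        return num_tasks * default_task_duration
    # Refined estimate: task count times the measured mean task duration.
    return num_tasks * mean(completed_durations)


print(estimate_job_size(10, []))               # initial estimate
print(estimate_job_size(10, [2.0, 4.0, 3.0]))  # refined estimate
```

With these assumed defaults, a 10-task job starts at an estimate of 10.0 and moves to 30.0 once three tasks averaging 3 seconds have completed.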

SchedSim is a simulator for evaluating the impact of size estimation errors on size-based scheduling in big-data workloads. Details are given in a technical report.
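
The kind of experiment such a simulator supports can be sketched in a few lines. The example below is not SchedSim and does not use its interface: it assumes a single server, a batch of jobs arriving together, and a multiplicative log-normal error on each size estimate, all of which are illustrative choices.

```python
import math
import random


def mean_sojourn_with_errors(sizes, sigma, seed=0):
    """Mean completion time when jobs are ordered by a noisy size estimate.

    Assumed error model: estimate = true size * exp(N(0, sigma)).
    With sigma = 0 the estimates are exact.
    """
    rng = random.Random(seed)
    estimates = [s * math.exp(rng.gauss(0.0, sigma)) for s in sizes]
    # Schedule by estimated size; each job still runs for its true size.
    order = sorted(range(len(sizes)), key=lambda i: estimates[i])
    now = 0.0
    total = 0.0
    for i in order:
        now += sizes[i]
        total += now
    return total / len(sizes)


sizes = [3, 1, 2]
for sigma in (0.0, 0.5, 2.0):
    print(sigma, mean_sojourn_with_errors(sizes, sigma))
```

Sweeping `sigma` shows how far the mean response time drifts from the exact-estimate case as the errors grow.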

Please refer to Deliverable D3.3 for a full description of the HFSP Scheduler.

Quick start:

Please refer to the quickstart on GitHub for the scheduler and on Bitbucket for the simulator.

New features supported by the mPlane project

The HFSP Scheduler was built entirely within the mPlane and BigFoot projects.

mPlane proxy interface

None

Official version
  • August 23, 2014. Deliverable D3.3.
  • July 15, 2015. Deliverable D3.4.