T3.2 Scheduling Analytic Jobs

Lead: NEC

The distributed processing capabilities offered by Hadoop, and the cluster of computers that supports them, lie at the heart of WP3. This creates the need to organize and manage computing resources operating in a shared environment. Data-intensive algorithms in mPlane will cover production, recurrent, and experimental data analysis jobs. Within mPlane, many “users” (e.g., ISPs, regulation authorities) will share the same global computer cluster, thus avoiding redundancy both in physical deployments and in data storage. Essentially, we will study the problem of resource scheduling, i.e., how to allocate the resources of a cluster to a number of independent jobs that compete for them.

A promising approach, which guarantees both fairness in resource utilization and efficiency, is size-based scheduling. It poses several challenges with respect to the traditional problem: i) the size of a job is very difficult to estimate; ii) data-locality constraints (i.e., ensuring that computation takes place on the machines that currently hold the relevant data) complicate resource allocation; and iii) the scheduling algorithm itself cannot be computationally intensive, given the number of jobs we expect to run concurrently in the mPlane data analysis module. In this task we address these issues and develop scheduling policies for the Hadoop system: indeed, the job scheduler is a “pluggable” module that can replace the default FIFO and FAIR schedulers available off-the-shelf.
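
To make the idea concrete, the sketch below illustrates size-based ordering in its simplest form: jobs are dispatched in increasing order of their estimated remaining size, so that small jobs are not delayed behind large ones. The Job class, its remainingSize field, and the example job names are hypothetical placeholders introduced for illustration; this is neither Hadoop's scheduler API nor the actual mPlane scheduler, which must additionally cope with size-estimation errors and data locality.

```java
import java.util.Comparator;
import java.util.PriorityQueue;

// Minimal sketch of size-based (smallest-remaining-size-first) job ordering.
// The Job class and its fields are hypothetical placeholders, not Hadoop APIs.
public class SizeBasedQueueSketch {

    static final class Job {
        final String id;
        final double remainingSize; // estimated remaining work, arbitrary units

        Job(String id, double remainingSize) {
            this.id = id;
            this.remainingSize = remainingSize;
        }
    }

    public static void main(String[] args) {
        // Jobs are dequeued in increasing order of estimated remaining size,
        // so short jobs finish quickly instead of waiting behind long ones.
        PriorityQueue<Job> queue =
                new PriorityQueue<>(Comparator.comparingDouble((Job j) -> j.remainingSize));

        queue.add(new Job("experimental-analysis", 5.0));
        queue.add(new Job("production-report", 120.0));
        queue.add(new Job("recurrent-aggregation", 30.0));

        while (!queue.isEmpty()) {
            Job next = queue.poll();
            System.out.printf("dispatch %s (estimated size %.1f)%n", next.id, next.remainingSize);
        }
    }
}
```

In a realistic deployment, the estimated size would be refined as tasks complete, and the ordering would be balanced against data locality when assigning individual tasks to nodes, which is precisely where the challenges listed above arise.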