Thursday, 8 October 2015

Scheduling and disabling Cells

In order to scale the OpenStack cloud infrastructure at CERN, we were early to embrace an architecture that uses Cells. Cells is a Nova feature that allows a cloud infrastructure to be partitioned into smaller groups with independent control planes.

For large deployments, Cells offer several advantages:

  • a single endpoint for users;
  • increased availability and resilience of the infrastructure;
  • Nova and external components (DBs, message brokers) are kept from reaching their limits;
  • isolation of different use cases.

However, cells also have some limitations. Some Nova features don't work when running cells:
  • Security Groups; 
  • Managing aggregates on the top cell;
  • Availability Zone support; 
  • Server groups; 
  • The cell scheduler has limited functionality.

There have been many changes since we deployed our initial cells configuration two years ago. Over the past months, there has been a lot of work on Cells, especially to make sure that they are properly tested and to develop CellsV2, which should become the default way to deploy Nova in the future.

However, when using Cells today we continue to receive the following welcome message :)

"The cells feature of Nova is considered experimental by the OpenStack project because it receives much less testing than the rest of Nova. This may change in the future, but current deployers should be aware that the use of it in production right now may be risky."

At CERN, we now have 26 child cells supporting 130,000 cores across two data centres in a single cloud. Some cells are dedicated to general use cases, while others are dedicated to specific projects.

In order to map projects to cells we developed a scheduler filter for the cell scheduler.

The filter relies on two new values defined in nova.conf: "cells_default" and "cells_projects".

  • "cells_default" contains the set of cells available for scheduling instances when the project is not mapped to any specific cell;
  • "cells_projects" contains the cell -> project mapping for the specific use cases.


For example, when an instance belonging to "project_uuid2" is created, it is scheduled to "cellE". But if the instance belongs to "project_uuid4", it is scheduled to one of the default cells ("cellA", "cellB", "cellC", "cellD").
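
The selection rule in this example can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the filter code itself: the function name candidate_cells and the list and dictionary layout are hypothetical stand-ins for the two nova.conf values, whose exact syntax is not reproduced here.

```python
# Hypothetical in-memory stand-ins for the "cells_default" and
# "cells_projects" values; the real ones are read from nova.conf.
CELLS_DEFAULT = ["cellA", "cellB", "cellC", "cellD"]
CELLS_PROJECTS = {"cellE": ["project_uuid2"]}  # cell -> projects


def candidate_cells(project_id, cells_default, cells_projects):
    """Return the cells where a new instance of project_id may be placed."""
    # Cells that are explicitly mapped to this project.
    dedicated = [
        cell for cell, projects in cells_projects.items()
        if project_id in projects
    ]
    # A mapped project only sees its dedicated cells;
    # every other project falls back to the default cells.
    return dedicated if dedicated else list(cells_default)


mapped = candidate_cells("project_uuid2", CELLS_DEFAULT, CELLS_PROJECTS)
unmapped = candidate_cells("project_uuid4", CELLS_DEFAULT, CELLS_PROJECTS)
print(mapped)
print(unmapped)
```

The cell scheduler would then pick one cell from the returned candidates; how that final choice is made is outside the scope of this sketch.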

One of the problems when using cells is that it is not possible to disable them from the scheduler.

With this scheduler filter we can achieve this. To disable a cell we just need to remove it from the "cells_default" or "cells_projects" list. Disabling a cell means that it is no longer possible to create new instances on it; however, it remains available for operations such as restart, resize and delete on existing instances.
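
As a rough sketch of why removing a cell from the list is enough to disable it, consider the following Python snippet. It is an assumption-laden illustration rather than our filter: schedulable_cells and known_cells are hypothetical names, and the cell and project identifiers are made up for the example.

```python
def schedulable_cells(known_cells, project_id, cells_default, cells_projects):
    """Return the known cells that may receive new instances of project_id."""
    # Cells explicitly mapped to the project, if any.
    dedicated = {
        cell for cell, projects in cells_projects.items()
        if project_id in projects
    }
    allowed = dedicated or set(cells_default)
    # Cells missing from the configuration are filtered out of scheduling,
    # but they stay in known_cells, so existing instances remain reachable.
    return [cell for cell in known_cells if cell in allowed]


known_cells = ["cellA", "cellB", "cellC", "cellD", "cellE"]
cells_default = ["cellA", "cellB", "cellC", "cellD"]
cells_projects = {"cellE": ["project_uuid2"]}

before = schedulable_cells(known_cells, "project_uuid4", cells_default, cells_projects)

# "Disable" cellB by removing it from the default list.
cells_default = [cell for cell in cells_default if cell != "cellB"]
after = schedulable_cells(known_cells, "project_uuid4", cells_default, cells_projects)

print(before)
print(after)
```

In this sketch the disabled cell is still a known cell, which matches the behaviour described above: only the placement of new instances is affected.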

These experiences will be discussed at the upcoming summit in Tokyo, in the deep dive into the CERN OpenStack deployment, at the Ops meetup and in the Nova design sessions.