The School of Computer Science (SCS) facilities are currently undergoing major renovations, including the entire HP5100 wing of the Herzberg Building. To allow construction work to proceed, the wing must be temporarily vacated and its contents relocated. 

One of the most significant challenges in this process is moving the school’s server room facility. This infrastructure supports departmental servers, specialized research equipment, and the OpenStack cloud platform used by the school. 

SCS server room relocation led by Andrew Pullin

The stakes are high. More than 2,000 undergraduate students rely on the OpenStack cloud for course assignments and laboratory work, with usage peaking during the fall and winter academic terms. At the same time, graduate students and researchers depend on the system around the clock to run Computer Science and Data Science experiments, simulations, and long-running computational workloads. 

This creates a difficult question: 

  • How do you relocate a critical server facility in the middle of the winter term, without interrupting service for the more than 2,000 students, faculty, and staff who depend on it 24/7, when that infrastructure runs one of the most complex cloud platforms ever created: OpenStack?

The School of Computer Science partnered with the university’s Information Technology Services (ITS), which offered temporary server space in the Carleton Library to host the infrastructure during the renovation. 

With a relocation site secured, the team considered two primary strategies for moving the server facility. 

  

Option 1: Full Shutdown and Rapid Relocation 

  • Under this approach, the entire server room would be powered down, physically moved to the new location in a single day, and then reassembled and brought back online. 
  • The advantage of this strategy is speed. In the best-case scenario, the move could be completed within a day, resulting in only one to two days of downtime for users. 
  • However, the risks were significant. If multiple servers failed to start after the move, or if a critical infrastructure node encountered problems, the entire OpenStack environment could remain offline for an extended period. In a worst-case scenario, service outages could stretch into days or even weeks while systems were repaired and reconfigured. 

  

Option 2: Live Migration and Incremental Relocation 

  • The second option involved a slower, more deliberate process: migrating servers individually while gradually relocating hardware to the new facility. 
  • Although this approach would take considerably longer, it offered a key advantage. Each server could be handled carefully and validated before proceeding to the next. If a problem occurred, it would affect only a single system rather than the entire infrastructure. 
  • This incremental strategy significantly reduced the risk of a prolonged outage and ensured the OpenStack cloud could remain operational throughout the relocation. 

 

Weighing the speed of Option 1 against its risk of a prolonged outage, the SCS technical staff ultimately chose the incremental strategy. The effort was led by Andrew Pullin, who coordinated the migration plan and oversaw the relocation process.

Before any hardware was moved, the team ensured the necessary network infrastructure was in place. The SCS subnet was extended between the Herzberg Building and the Carleton Library, effectively spanning both locations. Because the OpenStack cloud requires its infrastructure nodes to reside on the same subnet, this network configuration was critical. 

With the subnet extended across both buildings, OpenStack could treat servers located in Herzberg and those in the library as part of the same environment, regardless of the physical distance between them. This allowed systems to be relocated gradually while remaining fully integrated with the existing cloud infrastructure. 
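
The article doesn't say how the subnet was stretched; on a campus network this is typically done with VLAN trunking on the core switches. As a rough sketch of the idea, the Python script below builds a VXLAN tunnel between two hypothetical Linux gateways, one per building, and bridges it onto the local server LAN. All interface names, addresses, and the VXLAN ID here are illustrative assumptions, not the actual SCS configuration.

    # Illustrative sketch: stretching one L2 subnet across two buildings
    # with a VXLAN tunnel over the routed campus network. All names,
    # addresses, and IDs are hypothetical. Requires root privileges.
    import subprocess

    LOCAL_IP = "10.10.1.2"   # tunnel endpoint in Herzberg (assumed)
    REMOTE_IP = "10.20.1.2"  # tunnel endpoint in the library (assumed)

    def run(cmd: str) -> None:
        """Run one iproute2 command, raising if it fails."""
        subprocess.run(cmd.split(), check=True)

    # Encapsulate Ethernet frames in UDP (port 4789) between the two
    # buildings, so both sites share a single broadcast domain.
    run(f"ip link add vxlan100 type vxlan id 100 local {LOCAL_IP} "
        f"remote {REMOTE_IP} dstport 4789 dev eth0")

    # Bridge the tunnel with the local server-room LAN (eth1, assumed) so
    # hosts on either side appear as neighbours on the same subnet.
    run("ip link add br-scs type bridge")
    run("ip link set vxlan100 master br-scs")
    run("ip link set eth1 master br-scs")
    run("ip link set vxlan100 up")
    run("ip link set br-scs up")

Running the mirror image of this script on the library gateway (with the local and remote addresses swapped) would let machines in both rooms reach each other at layer 2.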

SCS tech staff: Karim Ismail and Andrew Pullin configuring a GPU server

The OpenStack environment runs on a virtualized cloud infrastructure, where workloads exist as server images rather than being tied to specific physical machines. This architecture proved to be a major advantage during the relocation. 

Virtual machine images could be migrated to the library facility ahead of the physical move. Each night, servers in Herzberg copied their images to the library location. By morning, once the workloads had successfully migrated, the now-vacant physical server in Herzberg could be safely powered down, removed, and transported to the new facility. 

This approach significantly reduced risk. If an issue occurred during migration, it would affect only a single server rather than the entire cloud environment, allowing problems to be isolated and resolved without disrupting the broader system. 
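
For a sense of what one night of this process might look like, the sketch below drains a single Herzberg compute node using the openstacksdk live-migration API, then confirms the node is empty before it is unplugged. The cloud name, host names, and polling details are assumptions for illustration; the article does not describe the team's actual tooling.

    # Illustrative sketch: live-migrating every instance off one compute
    # node so its hardware can be powered down and moved. Cloud and host
    # names are hypothetical, not the real SCS configuration.
    import time
    import openstack

    conn = openstack.connect(cloud="scs-cloud")   # assumed clouds.yaml entry

    SRC_HOST = "herzberg-compute-01"   # node being vacated tonight (assumed)
    DST_HOST = "library-compute-01"    # node already racked in the library

    # List every instance currently running on the source hypervisor.
    for server in list(conn.compute.servers(all_projects=True, host=SRC_HOST)):
        # Ask Nova to live-migrate the instance; block_migration=True also
        # copies local disks, for hosts without shared storage.
        conn.compute.live_migrate_server(server, host=DST_HOST,
                                         block_migration=True)
        # Poll until the instance reports the destination as its new home.
        # (A real script would time out and alert instead of waiting forever.)
        while conn.compute.get_server(server.id).compute_host != DST_HOST:
            time.sleep(10)
        print(f"{server.name}: now on {DST_HOST}")

    # Once nothing is left on the source host, it is safe to shut down.
    leftover = list(conn.compute.servers(all_projects=True, host=SRC_HOST))
    assert not leftover, f"{SRC_HOST} still hosts {len(leftover)} instances"

Because each iteration moves and verifies a single instance, a failure stops the process at one affected server, which is exactly the limited blast radius that made the incremental option attractive.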

Winter, however, introduced an entirely different challenge. 

This year, Ottawa experienced a particularly harsh and snowy winter, so much so that the Rideau Canal remained open for 56 days of skating, an unusually long season. Moving more than 55 servers across campus in freezing conditions is not a trivial task. 

Rideau Canal Skateway

Fortunately, Carleton University has a unique advantage: its extensive underground tunnel system. 

Using a golf cart and trailer, the team transported servers through the tunnels in small batches, safely moving equipment between buildings without exposure to the winter weather. What might have been a logistical nightmare outdoors became a surprisingly efficient relocation route beneath the campus. 

In less than two months, in the middle of the winter academic term, Andrew Pullin and the SCS technical team successfully migrated the entire environment. The move included: 

  • 4 racks of equipment
  • 25 compute nodes totaling 1,672 CPU cores 
  • 24 GPU servers containing 138 GPUs 
  • The full OpenStack infrastructure stack 
  • Supporting networking and storage systems 

In the end, the project lived up to its title: the cloud was successfully moved across campus during the winter term, and it never had to shut down.

Server room relocation timelapse: front and back of each of the 4 racks

Modern network administration and virtualization technologies made this complex relocation possible. 

Of course, there is one small catch. 

When the renovations are finished… everything will have to be moved back.