Data Centre Operations: Optimising Infrastructure for Performance and Reliability


Rethinking infrastructure for the AI era
In this exclusive article for DCNN, Jon Abbott, Technologies Director, Global Strategic Clients at Vertiv, explains how the challenge for operators is no longer simply maintaining uptime; it’s adapting infrastructure fast enough to meet the unpredictable, high-intensity demands of AI workloads.

Built for backup, ready for what’s next

Artificial intelligence (AI) is changing how facilities are built, powered, cooled, and secured. The industry is now facing hard questions about whether existing infrastructure, designed for traditional enterprise or cloud loads, can be successfully upgraded to support the pace and intensity of AI-scale deployments. Data centres are being pushed to adapt quickly and the pressure is mounting from all sides: from soaring power densities to unplanned retrofits, and from tighter build timelines to demands for grid interactivity and physical resilience. What’s clear is that we’ve entered a phase where infrastructure is no longer just about uptime; it’s about responsiveness, integration, and speed.

The new shape of demand

Today’s AI systems don’t scale in neat, predictable increments; they arrive with sharp step-changes in power draw, heat generation, and equipment footprint. Racks that once averaged under 10kW are being replaced by those consuming 30kW, 40kW, or even 80kW - often in concentrated blocks that push traditional cooling systems to their limits. This is as much a physical problem as an electrical one. Heavier and wider AI-optimised racks require new planning for load distribution, cooling system design, and containment. Many facilities are discovering that the margins they once relied on - in structural tolerance, space planning, or energy headroom - have already evaporated.

Cooling strategies, in particular, are under renewed scrutiny. While air cooling continues to serve much of the IT estate, the rise of liquid-cooled AI workloads is accelerating.
Rear-door heat exchangers and direct-to-chip cooling systems are no longer reserved for experimental deployments; they are being actively specified for near-term use. Most of these systems do not replace air entirely, but work alongside it. The result is a hybrid cooling environment that demands more precise planning, closer system integration, and a shift in maintenance thinking.

Deployment cycles are falling behind

One of the most critical tensions AI introduces is the mismatch between innovation cycles and infrastructure timelines. AI models evolve in months, but data centres are typically built over years. This gap creates mounting pressure on procurement, engineering, and operations teams to find faster, lower-risk deployment models. As a result, there is increasing demand for prefabricated and modular systems that can be installed quickly, integrated smoothly, and scaled with less disruption. These approaches are not being adopted primarily to reduce cost; they are being adopted to save time and to de-risk complex commissioning across mechanical and electrical systems. Integrated uninterruptible power supply (UPS) and power distribution units, factory-tested cooling modules, and intelligent control systems are all helping operators compress build timelines while maintaining performance and compliance. Where operators once sought redundancy above all, they are now also prioritising responsiveness: the ability to flex infrastructure around changing workload patterns.

Security matters more when the stakes rise

AI infrastructure is expensive, energy-intensive, and often tied to commercially sensitive operations. That puts physical security firmly back on the agenda - not only for hyperscale operators, but also for enterprise and colocation facilities managing high-value compute assets. Modern data centres are now adopting a more layered approach to physical security.
It begins with perimeter control, but extends through smart rack-level locking systems, biometric or multi-factor authentication, and role-based access segmentation. For some facilities - especially those serving AI training operations - real-time surveillance and environmental alerting are being integrated directly into operational platforms. The aim is to reduce blind spots between security and infrastructure and to help identify risks before they interrupt service.

The invisible fragility of hybrid environments

One of the emerging risks in AI-scale facilities is the unintended fragility created by multiple overlapping systems. Cooling loops, power chains, telemetry platforms, and asset tracking tools all work in parallel, but without careful integration, they can fail to provide a coherent operational picture. Hybrid cooling systems may introduce new points of failure that are not always visible to standard monitoring tools. Secondary fluid networks, for instance, must be managed with the same criticality as power infrastructure. If overlooked, they can become weak points in otherwise well-architected environments. Likewise, inconsistent commissioning between systems can lead to drift, incompatibility, and inefficiency. These challenges are prompting many operators to invest in more integrated control platforms that span thermal, electrical, and digital infrastructure. The goal is not just to see issues, but to act on them quickly - to re-balance loads, adapt cooling, or respond to anomalies in real time.

Power systems are evolving too

As compute densities rise, so too does energy consumption. Operators are looking at how backup systems can do more than sit idle: UPS fleets are being turned into grid-support assets, and demand response and peak shaving programmes are becoming part of energy strategy.
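The peak-shaving idea can be sketched as a simple control rule: draw on stored energy whenever site demand exceeds a contracted grid ceiling. The function, figures, and thresholds below are illustrative assumptions, not any vendor's actual control logic.

```python
# Hypothetical sketch of a peak-shaving rule: serve demand above a
# contracted grid cap from UPS/battery storage, up to its discharge limit.
# All names and numbers are illustrative assumptions.

def grid_draw_kw(site_demand_kw: float, grid_cap_kw: float,
                 battery_discharge_limit_kw: float) -> tuple[float, float]:
    """Split site demand into (power drawn from grid, power drawn from storage)."""
    excess = max(0.0, site_demand_kw - grid_cap_kw)
    from_battery = min(excess, battery_discharge_limit_kw)
    return site_demand_kw - from_battery, from_battery

# Demand spikes to 1,200 kW against a 1,000 kW contracted cap,
# with 300 kW of available battery discharge capacity:
grid, battery = grid_draw_kw(1200.0, 1000.0, 300.0)
print(grid, battery)  # 1000.0 200.0
```

A real implementation would also account for battery state of charge, ramp rates, and the commercial terms of the demand-response programme, but the shaving decision itself reduces to this comparison.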
Many data centres are now exploring microgrid models that incorporate renewables, fuel cells, or energy storage to offset demand and reduce reliance on volatile grid supply. What all of this reflects is a shift in mindset. Infrastructure is no longer a fixed investment; it is a dynamic capability - one that must scale, flex, and adapt in real time. Operators who understand this are best placed to succeed in a fast-moving environment.

From resilience to responsiveness

The old model of data centre resilience was built around failover and redundancy. Today, resilience also means responsiveness: the ability to handle unexpected load spikes, adjust cooling to new workloads, maintain uptime under tighter energy constraints, and secure physical systems across more fluid operating environments. This shift is already reshaping how data centres are specified, how vendors are selected, and how operators evaluate return on infrastructure investment. What once might have been designed in isolated disciplines - cooling, power, controls, access - is now being engineered as part of a joined-up, system-level operational architecture. Intelligent data centres are not defined by their scale, but by their ability to stay ahead of what’s coming next.

For more from Vertiv, click here.

AI infrastructure as a Trojan horse for climate infrastructure
Data centres are getting bigger, denser, and more power-hungry than ever before. The rapid rise of artificial intelligence (AI) is accelerating this expansion, driving one of the largest capital build-outs of our time. Left unchecked, hyperscale growth could deepen strains on energy, water, and land - while concentrating economic benefits in just a few regions. But this trajectory isn’t inevitable.

In this whitepaper, Shilpika Gautam, CEO and founder of Opna, explores how shifting from training-centric hyperscale facilities to inference-first, modular, and distributed data centres can align AI’s growth with climate resilience and community prosperity. The paper examines:

• How right-sized, locally integrated data centres can anchor clean energy projects and strengthen grids through flexible demand,
• Opportunities to embed circularity by reusing waste heat and water, and to drive demand for low-carbon materials and carbon removal, and
• The need for transparency, contextual siting, and community accountability to ensure measurable, lasting benefits.

Decentralised compute decentralises power. By embracing modular, inference-first design, AI infrastructure can become a force for both planetary sustainability and shared prosperity. You can download the whitepaper for yourself by clicking this link.

Duos Edge AI expands edge DC network in Texas
Duos Technologies Group, through its subsidiary Duos Edge AI, a provider of edge data centre (EDC) systems, has deployed a new EDC in Waco, Texas, at the Education Service Centre (ESC) Region 12. The facility marks the company’s sixth site in the state and aims to strengthen digital infrastructure to support local education and community development. The project aims to expand access to reliable, low-latency computing for schools and community partners, providing the resources needed to deliver modern digital learning tools and services.

Supporting distributed computing and digital learning

The partnership with Region 12 highlights the increasing demand for distributed computing infrastructure to enable K–12 education, workforce development, and regional digital transformation across Texas. The new edge data centre seeks to enhance bandwidth, secure data processing, and local AI capabilities, improving connectivity and efficiency throughout the region.

Doug Recker, President of Duos and Duos Edge AI, comments, “This deployment is a significant advancement in our statewide initiative to bring modern computing capabilities closer to where data is created and used.

"Our collaboration with Region 12 strengthens the foundation for next-generation learning tools, administrative efficiency, and community connectivity - all powered locally at the edge.”

Kenny Berry, Executive Director of ESC Region 12, adds, “Access to real-time, reliable data processing directly supports our educators and students.

"This edge data centre brings long-term value to our schools by enabling advanced learning technologies, improving efficiency, and reducing latency across our network.”

The deployment forms part of Duos Edge AI’s 2025 strategy to establish 15 operational edge data centres. Each facility features a modular design for rapid deployment, offering scalable compute power and high-speed connectivity in as little as 90 days.
These sites, the company says, are helping meet the evolving digital needs of the education, healthcare, and government sectors. For more from Duos Edge AI, click here.

Saudi Arabia’s first integrated data science and AI diploma
DataVolt, a Saudi Arabian developer and operator of sustainable data centres, has partnered with the Energy & Water Academy (EWA) and Innovatics to launch the Kingdom of Saudi Arabia’s first fully industry-integrated Diploma in Data Science and Artificial Intelligence (AI). The new programme is designed to blend academic learning with practical, real-world experience, helping to prepare Saudi talent for the digital economy.

Announced at the LEARN event in Riyadh, the Diploma is approved by the Technical and Vocational Training Corporation (TVTC) and College of Excellence (CoE), and endorsed by the Ministry of Communications and Information Technology (MCIT). It is also supported by the Human Resources Development Fund (HRDF).

Unlike traditional academic pathways, the programme combines classroom study with applied projects. Students will work with sponsoring companies, including DataVolt, on live industry challenges, developing proof-of-concept AI applications and gaining hands-on experience that directly aligns with workforce needs.

As part of its commitment, DataVolt will sponsor five students from the first cohort and guarantee them employment after graduation. They will join the company’s operations supporting high-power-density workloads at its data centres, including its planned AI campus in Oxagon, NEOM. DataVolt is inviting other organisations to co-sponsor the inaugural class of 100 students, with a target of 50% female participation. The first intake is scheduled to begin in November 2025.

Industry-led learning for the digital future

Rajit Nanda, CEO at DataVolt, says, “DataVolt is not only building the Kingdom’s next-generation data centres, but also the local Saudi talent to power them, ensuring the country is prepared to lead the global AI economy in the long term.
"Our investment in this first-of-its-kind Diploma demonstrates our commitment to Vision 2030 and we encourage our partners across the industry to join us in sponsoring the programme and future-proofing the local workforce.” With AI expected to contribute around US$320 billion (£239.1 billion) to the Middle East economy by 2030, and Saudi Arabia set to see the greatest share of that value, the initiative supports the country’s Vision 2030 and the National Strategy for Data and AI (NSDAI). The programme aims to help bridge the national skills gap and contribute to the target of training 20,000 AI professionals over the next five years. Salwa Smaoui, CEO of Innovatics, comments, “This Diploma is not just education; it is a strategic workforce initiative. Our mission is to ensure every graduate is ready to contribute from day one to the Kingdom’s most ambitious AI projects.” Tariq Alshamrani, CEO of EWA, adds, “EWA is proud to partner with DataVolt and Innovatics to deliver this programme. Together, we are developing the next generation of data scientists and AI professionals who will power the Kingdom’s digital future.” DataVolt continues to expand its data centre footprint across Saudi Arabia. Earlier this year, the company signed an agreement with NEOM to design and develop the region’s first sustainable, net-zero AI campus in Oxagon. The first phase of the 1.5 GW development, backed by an initial investment of US$5 billion (£3.7 billion), is expected to begin operations in 2028. For more from DataVolt, click here.

Paving the way for efficient high-density AI at 400G & 800G
AI workloads are reshaping the data centre. As back-end traffic scales and racks densify, the interconnect choices you make today will determine the performance, efficiency, and scalability of tomorrow’s AI infrastructure. In this fast, focused 30-minute live tech talk, Siemon’s experts will share a practical, cabling-led view to help you plan smarter and deploy faster. Drawing on field experience and expectations from large-scale AI deployments, the session will give you clear context and actionable guidance before your next design, upgrade, or AI back-end project begins.

Discover:

• AI market overview & nomenclature: A clear look at scale-up vs scale-out networks and where each fits in AI planning.
• Reference designs & deployment sizes: Common GPU pod approaches (including air-cooled and liquid-to-chip) and what they mean for density and footprint.
• AI network connection points: Critical interconnect considerations for high-performance AI back-end networks.
• AI network cabling considerations: What to evaluate when selecting cables for demanding 400G/800G workloads.
• Cabling options that improve efficiency: Real-world examples of how architecture choices affect deployment efficiency, including a 1024-GPU comparison.

Walk away with:

• A clear understanding of high-density interconnect options.
• Insight into proven deployment strategies and the trade-offs that matter.
• Confidence to make informed decisions that scale with AI workloads.

Speaker: Ryan Harris, Director, Systems Engineering (High-Speed Interconnect), Siemon
Date: Thursday, 2 October 2025
Time: 2:00–2:30 PM BST | 3:00–3:30 PM CET

This is the must-see tech talk for anyone planning, designing, or deploying high-density AI data centres. Don’t miss your chance to get the insight that can accelerate your next project and keep your infrastructure ready for the demands ahead. Register now via this link to secure your spot.

For more from Siemon, click here.

Schneider Electric unveils AI DC reference designs
Schneider Electric, a French multinational specialising in energy management and industrial automation, has announced new data centre reference designs developed with NVIDIA, aimed at supporting AI-ready infrastructure and easing deployment for operators. The designs include integrated power management and liquid cooling controls, with compatibility for NVIDIA Mission Control, the company’s AI factory orchestration software. They also support deployment of NVIDIA GB300 NVL72 racks with densities of up to 142kW per rack.

Integrated power and cooling management

The first reference design provides a framework for combining power management and liquid cooling systems, including Motivair technologies. It is designed to work with NVIDIA Mission Control to help manage cluster and workload operations. This design can also be used alongside Schneider Electric’s other data centre blueprints for NVIDIA Grace Blackwell systems, allowing operators to manage the power and liquid cooling requirements of accelerated computing clusters.

A second reference design sets out a framework for AI factories using NVIDIA GB300 NVL72 racks in a single data hall. It covers four technical areas: facility power, cooling, IT space, and lifecycle software, with versions available under both ANSI and IEC standards.

Deployment and performance focus

According to Schneider Electric, operators are facing significant challenges in deploying GPU-accelerated AI infrastructure at scale. Its designs are intended to speed up rollout and provide consistency across high-density deployments.

Jim Simonelli, Senior Vice President and Chief Technology Officer at Schneider Electric, says, “Schneider Electric is streamlining the process of designing, deploying, and operating advanced AI infrastructure with its new reference designs.
"Our latest reference designs, featuring integrated power management and liquid cooling controls, are future-ready, scalable, and co-engineered with NVIDIA for real-world applications - enabling data centre operators to keep pace with surging demand for AI.” Scott Wallace, Director of Data Centre Engineering at NVIDIA, adds, “We are entering a new era of accelerated computing, where integrated intelligence across power, cooling, and operations will redefine data centre architectures. "With its latest controls reference design, Schneider Electric connects critical infrastructure data with NVIDIA Mission Control, delivering a rigorously validated blueprint that enables AI factory digital twins and empowers operators to optimise advanced accelerated computing infrastructure.” Features of the controls reference design The controls system links operational technology and IT infrastructure using a plug-and-play approach based on the MQTT protocol. It is designed to provide: • Standardised publishing of power management and liquid cooling data for use by AI management software and enterprise systems• Management of redundancy across cooling and power distribution equipment, including coolant distribution units and remote power panels• Guidance on measuring AI rack power profiles, including peak power and quality monitoring Reference design for NVIDIA GB300 NVL72 The NVIDIA GB300 NVL72 reference design supports clusters of up to 142kW per rack. A data hall based on this design can accommodate three clusters powered by up to 1,152 GPUs, using liquid-to-liquid coolant distribution units and high-temperature chillers. The design incorporates Schneider Electric’s ETAP and EcoStruxure IT Design CFD models, enabling operators to create digital twins for testing and optimisation. It builds on earlier blueprints for the NVIDIA GB200 NVL72, reflecting Schneider Electric’s ongoing collaboration with NVIDIA. 
The company now offers nine AI reference designs covering a range of scenarios, from prefabricated modules and retrofits to purpose-built facilities for NVIDIA GB200 and GB300 NVL72 clusters. For more from Schneider Electric, click here.

Duos deploys fifth edge data centre
Duos Technologies Group, through its operating subsidiary Duos Edge AI, a provider of adaptive, versatile, and streamlined edge data centre (EDC) solutions tailored to meet evolving needs in any environment, has announced its latest data centre deployment, moving the company closer to its anticipated goal of 15 deployments by the year's end. The latest project, in partnership with Dumas Independent School District (ISD), will deploy an on-premise EDC in Dumas, Texas.

This project marks another milestone in Duos Edge AI’s expansion into rural communities, providing low-latency compute and connectivity that directly support K-12 education and regional growth. The Dumas ISD edge data centre will serve as a localised hub for real-time data processing, enabling advanced educational tools, stronger digital infrastructure, and improved connectivity for students and staff across the district.

“As Director of Information Technology for Dumas ISD, I am excited about our partnership with Duos Edge AI,” says Raymond Brady, Director of Information Technology at Dumas ISD. “This collaboration brings direct, on-premise access to a cutting-edge data centre, an extraordinary opportunity for a rural community like Dumas. It will significantly strengthen the district’s technology capabilities and support our mission of achieving academic excellence through collaboration with students, parents, and the community. I look forward to working with Duos Edge AI as we continue to provide innovative technology for our students and staff, ensuring every student is prepared for success.”

“This partnership with Dumas ISD is a perfect example of how edge technology can create lasting impact in rural communities,” adds Doug Recker, newly appointed President of Duos Technologies Group and founder of its subsidiary, Duos Edge AI.
“By placing powerful computing infrastructure directly on campus, we’re helping schools like Dumas unlock real-time digital tools that drive student achievement, workforce readiness, and community growth.”

This deployment is part of Duos Edge AI’s broader 2025 plan to establish 15 modular EDCs nationwide, with a focus on underserved and high-growth markets. By locating advanced computing infrastructure closer to end users, Duos Edge AI ensures reliable, secure, and scalable technological access for schools, healthcare facilities, and local communities.

For more from Duos Edge, click here.

Cadence adds NVIDIA DGX SuperPOD to digital twin platform
Cadence, a developer of electronic design automation software, has expanded its Reality Digital Twin Platform library with a digital model of NVIDIA’s DGX SuperPOD with DGX GB200 systems. The addition is aimed at supporting data centre designers and operators in planning and managing facilities for large-scale AI workloads.

The Reality Digital Twin Platform enables users to create detailed digital replicas of data centres, simulating power, cooling, space, and performance requirements before physical deployment. By adding the NVIDIA DGX SuperPOD, Cadence says engineers can model AI factory environments with greater accuracy, supporting faster deployment and improved operational efficiency.

Digital twins for AI data centres

Michael Jackson, Senior Vice President of System Design and Analysis at Cadence, says, “Rapidly scaling AI requires confidence that you can meet your design requirements with the target equipment and utilities.

"With the addition of a digital model of NVIDIA’s DGX SuperPOD with DGX GB200 systems to our Cadence Reality Digital Twin Platform library, designers can model behaviourally accurate simulations of some of the most powerful accelerated systems in the world, reducing design time and improving decision-making accuracy for mission-critical projects.”

Tim Costa, General Manager of Industrial and Computational Engineering at NVIDIA, adds, “Creating the digital twin of our DGX SuperPOD with DGX GB200 systems is an important step in enabling the ecosystem to accelerate AI factory buildouts.

"This step in our ongoing collaboration with Cadence fills a crucial need as the pace of innovation increases and time-to-service shrinks.”

The Cadence Reality Digital Twin Platform allows engineers to drag and drop vendor-provided models into simulations to design and test data centres. It can also be used to evaluate upgrade paths, failure scenarios, and long-term performance. The library currently contains more than 14,000 items from over 750 vendors.
Industry engagement

The addition of the NVIDIA model is part of Cadence’s ongoing collaboration with NVIDIA, following earlier support for the NVIDIA Omniverse blueprint for AI factory design. Cadence will highlight the expanded platform at the AI Infra Summit in Santa Clara from 9-11 September, where company experts will take part in keynotes, panels, and talks on chip efficiency and simulation-driven data centre operations.

For more from Cadence, click here.

Nokia, Supermicro partner for AI-optimised DC networking
Finnish telecommunications company Nokia has entered into a partnership with Supermicro, a provider of application-optimised IT systems, to deliver integrated networking platforms designed for AI, high-performance computing (HPC), and cloud workloads. The collaboration combines Supermicro’s advanced switching hardware with Nokia’s data centre automation and network operating system for cloud providers, hyperscalers, enterprises, and communications service providers (CSPs).

Building networks for AI-era workloads

Data centres are under increasing pressure from the rising scale and intensity of AI and cloud applications. Meeting these requirements demands a shift in architecture that places networking at the core, with greater emphasis on performance, scalability, and automation.

The joint offering integrates Supermicro’s 800G Ethernet switching platforms with Nokia’s Service Router Linux (SR Linux) Network Operating System (NOS) and Event-Driven Automation (EDA). Together, these form an infrastructure platform that automates the entire network lifecycle - from initial design through deployment and ongoing operations. According to the companies, customers will benefit from "a pre-validated solution that shortens deployment timelines, reduces operational costs, and improves network efficiency."

Industry perspectives

Cenly Chen, Chief Growth Officer, Senior Vice President, and Managing Director at Supermicro, says, "This collaboration gives our customers more choice and flexibility in how they build their infrastructure, with the confidence that Nokia’s SR Linux and EDA are tightly integrated with our systems.

"It strengthens our ability to deliver networked compute architectures for high-performance workloads, while simplifying orchestration and automation with a unified platform."
Vach Kompella, Senior Vice President and General Manager of the IP Networks Division at Nokia, adds, "Partnering with Supermicro further validates Nokia SR Linux and Event-Driven Automation as the right software foundation for today’s data centre and IP networks.

"It also gives us significantly greater reach into the enterprise market through Supermicro’s extensive channels and direct sales, aligning with our strategy to expand in cloud, HPC, and AI-driven infrastructure."

For more from Nokia, click here.

Duos Edge AI awarded patent for modular DC entryway
The US Patent and Trademark Office has granted Duos Edge AI, a provider of edge data centre (EDC) systems, a patent for a new entryway design for modular data centres. The system aims to improve security and protect mission-critical equipment by combining a two-door access configuration with filtration to reduce the intrusion of dust, dirt, and moisture.

Duos Edge AI, a subsidiary of Duos Technologies Group, develops modular edge data centres intended to provide reliable, low-latency data access in areas where traditional infrastructure is limited. The patented entryway is designed to support these facilities in remote or rural locations by improving equipment resilience and service uptime.

Supporting communities

The company’s edge data centres are used by schools, hospitals, warehouses, carriers, and first responders. By enhancing environmental protection for infrastructure, the new design is expected to strengthen operational continuity in sectors that depend on constant access to digital services.

Doug Recker, President and founder of Duos Edge AI, says, "This patent demonstrates our commitment to delivering ruggedised, field-ready edge data centres that meet the unique needs of rural and underserved markets.

"By addressing critical challenges like environmental intrusion, we are setting a higher standard for reliability and long-term value for our customers."

The modular approach aligns with Duos Edge AI’s wider focus on delivering scalable, rapidly deployable facilities that move data processing closer to users. This can help reduce latency, support real-time applications, and expand digital access in regions with growing demand.

For more from Duos Edge AI, click here.


