Thursday, April 24, 2025

Artificial Intelligence


AddOn Networks’ new transceiver range delivers optimal performance
A new range of 800G transceivers has been launched by AddOn Networks to enable Artificial Intelligence (AI) and Machine Learning (ML) tools and greater automation in data centre applications. Supporting both InfiniBand and Ethernet, the range has been designed to offer customers an attractive alternative to Network Equipment Manufacturers (NEMs) when it comes to building and maintaining enhanced optical networks. Data centre operators need extensive data, storage and compute capabilities to ensure the enhanced speeds and low latency required to overcome growing industry demands. AI and ML tools optimise existing infrastructure and establish accurate data analytics for faster decision-making and increased automation. Yet, the amount of bandwidth necessary for these to operate successfully, and the lack of compatible solutions in the market, makes selecting the right product a challenge. With this product launch, AddOn Networks will provide operators with the means to optimise their existing networks, alongside a premium support service and a reduction in part lead times. “This family of transceivers mark an exciting new era for AddOn Networks,” says AddOn Networks’ Director of Product Line Management, Ray Hagen. “Businesses operating in the data centre industry may already be aware of the benefits of 800G transceivers, but when ordering these directly through NEMs, they often experience long lead times and unnecessary delays in delivery. With our new range of transceivers, we will compress the timeline from order to delivery to ensure customers get solutions exactly when they require them. As a result, operators can maximise data centre output through AI and ML tools while making essential cost and time savings.” The transceiver range will enhance the handling and processing of bandwidth-intensive data flows when migrating from serial Central Processing Unit (CPU) based architecture towards parallel data flow in the Graphics Processing Unit (GPU). 
Once implemented within existing infrastructure, the transmission of data is accelerated, with additional capabilities to increase storage and compute capacity available to operators. “Our 800G transceivers are parallel tested in our customers’ environments to ensure performance matches what is offered by the NEMs,” continues Ray. “Our global leadership in third-party optics and expertise in testing means we can offer our customers not just a best-in-class transceiver, but a best-in-class service too. Our round-the-clock support puts customers in the best possible position to meet the growing demands of the industry. This launch reflects our commitment to introducing key solutions through multiple platforms to best serve those customers, as we move into the realm of AI and ML.” AddOn Networks carries out 100% testing in its laboratory to guarantee the family of transceivers mirror the customers’ data flows and their front-end environments to ensure full compatibility and performance while offering a lifetime warranty. As a result, customers can benefit from adaptable and reliable products, tailored to meet the specific demands of their data centre. More information on the products can be found on the AddOn website.

What will AI computing look like?
By Ed Ansett, Founder and Chairman, i3 Solutions Group
Wherever generative AI is deployed, it will change the IT ecosystem within the data centre. From processing to memory, networking to storage, and systems architecture to systems management, no layer of the IT stack will remain unaffected. For those on the engineering side of data centre operations tasked with providing the power and cooling to keep AI servers operating, both within existing data centres and in dedicated new facilities, the impact will be played out over the next 12 to 18 months.
Starting with the most fundamental IT change - and the one that has been enjoying the most publicity - AI is closely associated with the use of GPUs (Graphics Processing Units). The GPU maker Nvidia has been the greatest beneficiary. According to Reuters, “Analysts estimate that Nvidia has captured roughly 80% of the AI chip market. Nvidia does not break out its AI revenue, but a significant portion is captured in the company's data centre segment. So far in 2023, Nvidia has reported data centre revenue of $29.12bn.”
But even within the GPU universe, it will not be a case of one size or one architecture fits every AI deployment in every data centre. GPU accelerators built for HPC and AI are common, as are Field Programmable Gate Arrays (FPGAs), adaptive Systems on Chip (SoCs) or ‘SmartNICs’, and highly dense CUDA (Compute Unified Device Architecture) GPUs. An analysis from the Centre for Security and Emerging Technology, entitled ‘AI Chips: What They Are and Why They Matter’, says, “Different types of AI chips are useful for different tasks. GPUs are most often used for initially training and refining AI algorithms. FPGAs are mostly used to apply trained AI algorithms to "inference". 
ASICs can be designed for either training or inference.” As with all things AI, what’s happening at chip level is an area of rapid development, with growing competition between traditional chip makers, cloud operators and new market entrants who are racing to produce chips for their own use, for the mass market or both. As an example of disruption in the chip market, AWS announced ‘Trainium2’ in Summer 2023 as a next generation chip designed for training AI systems. The company proclaimed the new chip to be four times faster while using half the energy of its predecessor. Elsewhere, firms such as ARM are working with cloud providers to produce chips for AI, while AMD has invested billions of dollars in AI chip R&D. Intel, the world’s largest chip maker, is not standing still. Its product roadmap announced in December 2023 was almost entirely focused on AI processors, from PCs to servers.

Why more GPU servers?
The reason for the chip boom is the sheer number of power-hungry GPUs (Graphics Processing Units) or TPUs (Tensor Processing Units, developed by Google specifically for AI workloads) needed for generative AI workloads. A single AI model will run across hundreds of thousands of processing cores in tens of thousands of servers mounted in racks drawing 60-100kW per rack. As AI use scales and expands, this kind of rack power density will be common. The power and cooling implications for data centres are clear. There are several factors that set GPU servers apart from other types of server. According to Run:ai, these include:
“Parallelism: GPUs consist of thousands of small cores optimised for simultaneous execution of multiple tasks. This enables them to process large volumes of data more efficiently than CPUs with fewer but larger cores.
“Floating-point performance: The high-performance floating-point arithmetic capabilities in GPUs make them well-suited for scientific simulations and numerical computations commonly found in AI workloads. 
“Data transfer speeds: Modern GPUs come equipped with high-speed memory interfaces like GDDR6 or HBM2 which allow faster data transfer between the processor and memory compared to conventional DDR4 RAM used by most CPU-based systems.”

Parallel processing – AI computing and AI supercomputing
Like traditional supercomputing, AI supercomputing runs on parallel processing, in this case applied to neural networks. Parallel processing means using more than one microprocessor to handle separate parts of an overall task. It was first used in traditional supercomputers: machines where vast arrays of traditional CPU servers with hundreds of thousands of processors are set up as a single machine within a standalone data centre. Because GPUs were invented to handle graphics rendering, their parallel chip architecture makes them more suitable for breaking down complex tasks and working on the parts simultaneously. It is the nature of all AI that large tasks need to be broken down.
The announcements about AI supercomputers from the cloud providers and other AI companies are revealing. Google says it will build for customers its A3 AI supercomputers of 26,000 GPUs, delivering 26 exaFLOPS of AI throughput. AWS said it will build GPU clusters called Ultrascale that will deliver 20 exaFLOPS. Inflection AI said its AI cluster will consist of 22,000 NVIDIA H100 Tensor Core GPUs.
With such emphasis on GPU supercomputers, you could be forgiven for thinking all AI will run only on GPUs in the cloud. In fact, AI will reside not just in the cloud but across all types of existing data centres and on different types of server hardware. Intel points out, “Bringing GPUs into your data centre environment is not without challenges. These high-performance tools demand more energy and space. They also create dramatically higher heat levels as they operate. 
These factors impact your data centre infrastructure and can raise power costs or create reliability problems.” Of course, Intel wants to protect its dominance in the server CPU market. But the broader point is that data centre operators must prepare for an even greater mix of IT equipment living under the one roof. For data centre designers, even where being asked to accommodate running several thousand GPU machines at relatively small scale within an existing facility, be prepared to find more power and take more heat.
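The idea of parallel processing described above, breaking a large task into separate parts that are worked on simultaneously and then combined, can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the function names are hypothetical, the "task" is a simple sum of squares, and a thread pool stands in for the many cores of a GPU or supercomputer. Python threads show the decomposition pattern only; they do not deliver the hardware-level speed-up of a real GPU.

```python
from concurrent.futures import ThreadPoolExecutor


def partial_sum_of_squares(chunk):
    """Handle one separate part of the overall task."""
    return sum(x * x for x in chunk)


def split_into_chunks(data, n_chunks):
    """Split data into n_chunks contiguous, nearly equal parts."""
    size, remainder = divmod(len(data), n_chunks)
    chunks, start = [], 0
    for i in range(n_chunks):
        # The first `remainder` chunks take one extra element each.
        end = start + size + (1 if i < remainder else 0)
        chunks.append(data[start:end])
        start = end
    return chunks


def parallel_sum_of_squares(data, n_workers=4):
    """Break the task down, run the parts concurrently, combine the results."""
    chunks = split_into_chunks(data, n_workers)
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        partials = list(pool.map(partial_sum_of_squares, chunks))
    return sum(partials)


if __name__ == "__main__":
    numbers = list(range(1_000))
    print(parallel_sum_of_squares(numbers))
```

The same split, compute and combine pattern is what GPU hardware applies at far larger scale, with thousands of cores each taking a slice of the work.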

How AI is changing data centre infrastructure
By Geoff Dear, Technical Manager UK & Ireland, and Fatholah Zaki, LAN & Data Centre Sales Manager, Reichle & De-Massari UK AI is significantly influencing data centre design and operation in several ways. What’s more, it can be used to drive data centre efficiencies, enhance performance, and introduce new capabilities. Infrastructure and customer requirements are changing with the need to support the vast global uptake of AI. Making well-informed decisions about data centre infrastructure to accommodate this is vital, and choices made now will affect performance and business for years to come! According to Forbes Advisor, over 60% of business owners believe AI will increase productivity and improve customer relationships. Anticipated growth for Europe’s AI software market between 2021 and 2028 is an impressive 40%. According to a report from Accenture, AI could add an additional USD$814bn (£630bn) to the UK economy by 2035. The AI as a Service (AIaaS) market, forecasted by Research and Markets, is predicted to expand from $9.86bn in 2023 to $14.27bn in 2024 (a remarkable compound annual growth rate of 44.7%) and reach $63.2bn by 2028. Enabling migration from CPUs to GPUs To support AI usage, data centres require specialised hardware for efficiently processing highly complex computations: GPUs (Graphics Processing Units). These consist of hundreds or thousands of smaller cores that are optimised for handling multiple tasks simultaneously. As the name suggests, GPUs were originally designed to render graphics and carry out image processing tasks. However, their capabilities have expanded over the years. GPUs are currently used for general-purpose computing tasks that require significant parallel processing power, such as data analysis and machine learning. Compared to ‘traditional’ CPU-based installation, GPU racks have significantly more cores than CPUs, consume more power, emit more heat, and occupy more space. 
In fact, a 10-fold increase in the number of processors within the same footprint is anticipated. As a result, solutions need to be developed to enable connectivity for the vastly increasing numbers of GPUs. Integrated very small form factor (VSFF) connectors can play a very significant part in solving this challenge. VSFF SN and CS connectors are characterised by their small size but huge capacity for fibre connections. They are developed to allow for port breakout at high speeds, such as 400G, which is becoming increasingly common in data centre environments as networks expand to support larger volumes of data transmission. Use of VSFF connectors can increase the fibre count to 432 in a 1U space - a significant leap from the 144 fibres accommodated by LC duplex connectors. Using these connectors in data centres allows for the deployment of a significantly larger number of GPUs by maximising the use of available space and improving the efficiency of connections.

Power and cooling
Integration of AI accelerators also necessitates changes in rack designs and power distribution. Data centre designers need to work out kilowatt requirements per rack and where fluctuations might occur. Additional power and cooling systems are needed to handle higher power densities and thermal loads. It will also be important to determine where air cooling will suffice, and where liquid cooling can bring the greatest benefit. Liquid can conduct heat more effectively than air and is suited to growing data centre equipment densities. There are several ways of applying liquid cooling - from heat exchangers in rack doors to immersion cooling systems that completely submerge rack servers and other components in a dielectric liquid that doesn’t conduct electricity but does conduct heat. Immersion is the most energy-efficient and sustainable form of liquid cooling, making optimum use of liquid’s thermal transfer properties. Heat can also be reused. 
However, it’s important to consider the fact that this approach affects connectivity, as well as the costs, requirements and skillsets involved. The increased power and cooling requirements driven by AI can boost carbon emissions, while at the same time data centres everywhere are trying to get a firmer grip on their PUE. That makes monitoring, mitigation and process optimisation more important than ever. When it comes to power and cooling, AI introduces a challenge – but also provides part of the solution. Incorporating AI into data centre asset management can enhance resource utilisation and decision-making. Data centre design and building can be optimised using ‘Digital Twins’ and simulation modelling. AI algorithms and overlays displaying real-time energy and efficiency metrics can optimise cooling, power, and resource allocation. AI can analyse usage patterns and historical data to predict future asset needs, enabling better planning for procurement and replacement. However, successful implementation requires careful analysis, planning and integration with existing systems. Holistic approach It’s important that redesign is taken up as a cross-departmental effort, avoiding technology siloes. A holistic approach looking at every part of the data centre and its unique requirements is vital. What’s more, flexibility is key. A modular approach will help accommodate future hardware and capacity requirements which are impossible to predict right now. A well-designed structured cabling system increases uptime, scalability, and return on investment while reducing the technology footprint and operating expenses. This ensures all connections are standardised and organised, which is vital for maintaining the high performance and reliability required by GPU-intensive operations. 
Structured cabling, characterised by adherence to predefined standards with pre-set connection points and pathways, supports efficient system growth and change management, both crucial as GPU deployments evolve and demand enormous throughput. R&M will present its data centre offering at Data Centre World in London.
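The AI as a Service market figures quoted earlier in this article can be sanity-checked with the standard compound annual growth rate formula, CAGR = (end / start)^(1 / years) - 1. The sketch below is illustrative only: the function name `cagr` is hypothetical, and the only inputs are the dollar figures the article attributes to Research and Markets.

```python
def cagr(start_value, end_value, years):
    """Compound annual growth rate as a fraction (0.10 means 10% a year)."""
    return (end_value / start_value) ** (1 / years) - 1


# Figures quoted in the article, in billions of US dollars.
growth_2023_to_2024 = cagr(9.86, 14.27, years=1)
growth_2024_to_2028 = cagr(14.27, 63.2, years=4)

if __name__ == "__main__":
    print(f"2023 to 2024: {growth_2023_to_2024:.1%}")
    print(f"2024 to 2028: {growth_2024_to_2028:.1%}")
```

The one-year step from $9.86bn to $14.27bn reproduces the quoted 44.7%, and the 2028 figure implies a similar annual rate of roughly 45% for the following four years.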

DataVita launches new service in response to exponential AI demand
DataVita, Scotland’s data centre and cloud services provider, has invested in new infrastructure and capabilities to host the high-performance computing (HPC) workloads, supporting the exponential growth of artificial intelligence (AI) and machine learning. This marks a significant milestone, allowing for the hosting of high-density workloads at DataVita's DV1 facility in Lanarkshire. The facility now boasts the capacity to accommodate up to 100kW per rack for air cooling and up to 400kW per rack for liquid cooling. This enhancement significantly exceeds the capabilities of standard racks, providing essential support for the requirements of HPC, and represents a major leap forward for the Scottish data centre market. According to the US International Trade Administration, the UK’s AI market is currently valued at over £16.9bn and it is estimated to add £803.7bn to the UK economy by 2035. Alongside the accelerated adoption of generative AI models such as ChatGPT over the last year, DataVita says it has witnessed a huge surge in the volume of enquiries for high-capacity hosting and is already in talks with a number of globally significant tech providers. The higher proportion of renewable sources in Scotland's energy mix means there is a much lower carbon footprint associated with hosting data centres in the country compared to the rest of the UK and other nations. Relocating a 200-rack facility from London to Scotland would save over 6 million kgCO2e, equivalent to over 14 million miles driven by the average mid-sized car. Compared to Poland, it would reduce carbon emissions by 99%. Scotland generated more renewable power than it used for the first time in 2022, and therefore, has plenty of capacity to host the data needs of AI. Organisations could also save up to 70% on their data centre costs because of factors such as the country’s natural climate, which reduces the need for additional cooling.  
Danny Quinn, MD of DataVita, says, “AI is one of the fastest growing sectors of technology and could have huge benefits for businesses, as well as public services and the well-being of citizens who use them. However, to support its widespread use we need to have the infrastructure in place to underpin the advanced computing power and data it requires. “While other European nations are struggling with power and capacity, Scotland has a surplus of renewable energy that could be used to power this new and exciting technology that everyone is talking about. We see a big opportunity tied to the growing global demand, which is why we have redesigned elements of our DV1 facility to match the needs of AI and HPC providers. “The location is ideal for companies aiming to reduce the carbon footprint of their IT provision while maintaining unmatched resilience, security, power and connectivity. By using Scotland’s natural resources and existing renewable energy infrastructure, we are proving that increasing AI data workloads does not need to come at the expense of the environment.”

Singtel collaborates with NVIDIA to bring AI to Singapore
Singtel has joined the NVIDIA Partner Network Cloud Programme and will bring NVIDIA’s full-stack AI platform to enterprises across the region. This NVIDIA-powered AI Cloud will be hosted by Singtel’s Nxera regional data centre business, which will be developing a new generation of sustainable, hyper-connected AI-ready data centres. Bill Chang, CEO of Nxera and Singtel's Digital InfraCo unit, says, “We are pleased to collaborate with AI leader, NVIDIA, to deliver AI infrastructure services, democratising access for enterprises, start-ups, government agencies and research organisations to leverage the power of AI sustainably within our purpose-built AI data centres. This tie-up provides an easy on-ramp for enterprises across all industries which can accelerate their development of generative AI, large language models, AI finetuning and other AI workloads. Together with our sustainable AI data centres, 5G network platform and our AI cluster with NVIDIA, these form part of our next-generation digital infrastructure to support AI adoption and digital transformation in Singapore and the region.” Ronnie Vasishta, SVP for Telecom at NVIDIA, says, “Our collaboration with Singtel will combine technologies and expertise to facilitate the development of robust AI infrastructure in Singapore and throughout the region. In addition to supporting the Singapore government’s AI strategy, this will empower enterprises, regional customers and start-ups with advanced AI capabilities.” Singtel's upcoming hyper-connected green 58MW data centre Tuas in Singapore will be one of the first data centres that will be AI ready, when it comes into operation in early 2026. Data centre Tuas will offer a high-density environment suited for AI workloads and will operate at a Power Usage Effectiveness (PUE) of 1.23 at full load, making it one of the most efficient in the industry. Besides data centre Tuas, Singtel is developing two other data centre projects in Indonesia and Thailand. 
The NVIDIA GPU clusters will run in existing upgraded data centres initially and can be further expanded into the new AI DCs when they are ready for service.
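To put the quoted PUE of 1.23 in context: Power Usage Effectiveness is the ratio of total facility power to the power delivered to IT equipment, so a PUE of 1.23 means 0.23W of overhead (cooling, power distribution and so on) for every 1W of IT load. The sketch below works this through under a loudly stated assumption: it treats the 58MW figure as the IT load, which the article does not confirm. The function names are hypothetical.

```python
def facility_power_mw(it_load_mw, pue):
    """Total facility power implied by an IT load and a PUE (PUE = total / IT)."""
    return it_load_mw * pue


def overhead_power_mw(it_load_mw, pue):
    """Non-IT power (cooling, power distribution, etc.) implied by the PUE."""
    return it_load_mw * (pue - 1)


if __name__ == "__main__":
    it_load = 58.0  # MW; ASSUMPTION: the article's 58MW is treated as IT load
    pue = 1.23      # quoted full-load PUE
    print(f"Total facility power: {facility_power_mw(it_load, pue):.2f} MW")
    print(f"Overhead power: {overhead_power_mw(it_load, pue):.2f} MW")
```

If the 58MW were instead the total facility capacity, the same ratio would be applied in reverse, dividing by the PUE to estimate the IT load.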

Growth of AI creates unprecedented demand for global data centres
As the global economy continues to rapidly adopt Artificial Intelligence (AI), infrastructure to support these systems must keep pace. Consumers and businesses are expected to generate twice as much data in the next five years as all the data created over the past 10 years. This growth presents both an opportunity and a challenge for real estate investors, developers and operators. JLL’s Data Centers 2024 Global Outlook explores how data centres need to be designed, operated and sourced to meet the evolving needs of the global economy. With the growing demands of AI, data centre storage capacity is expected to grow from 10.1 zettabytes (ZB) in 2023 to 21.0 zettabytes in 2027, for a five-year compound annual growth rate of 18.5%. Not only will this increased storage generate a need for more data centres, but generative AI’s greater energy requirements, ranging from 300 to 500MW+, will also require more energy efficient designs and locations. The need for more power will require data centre operators to increase efficiency and work with local governments to find sustainable energy sources to support data centre needs. “As the data centre industry grapples with power challenges and the urgent need for sustainable energy, strategic site selection becomes paramount in ensuring operational scalability and meeting environmental goals,” says Jonathan Kinsey, EMEA Lead and Global Chair, Data Centre Solutions, JLL. 
“In many cases, existing grid infrastructure will struggle to support the global shift to electrification and the expansion of critical digital infrastructure, making it increasingly important for real estate professionals and developers to work hand in hand with partners to secure adequate future power.” Sustainable data centre design and operations solutions AI-specialised data centres look different than conventional facilities and may require operators to plan, design and allocate power resources based on the type of data processed or stage of generative AI development. As the amount of computing equipment installed and operated is expected to continue increasing with AI demand, heat generation will surpass current standards. Since cooling typically accounts for roughly 40% of an average data centre’s electricity use, operators are shifting from traditional air-based cooling methods to liquid cooling. Providers have shown that liquid cooling boasts significant power reductions, as high as 90%, while improving capability and space requirements. “In addition to location and design considerations, data centre operators are starting to explore alternative power sourcing strategies for onsite power generation including small modular reactors (SMRs), hydrogen fuel cells and natural gas,” says Andy Cvengros, Managing Director, US Data Centre Markets, JLL. “With power grids becoming effectively tapped out and transformers having more than three-year lead times, operators will need to innovate.” Global investment in data centres and power To support these requirements, critical changes need to be made across the globe to increase power usage. In Europe, one-third of the grid infrastructure is over 40 years old, requiring an estimated €584bn of investment by 2030 to meet the European Union’s green goals. In the United States, meeting energy transition goals to upgrade the grid and feed more renewable energy into the power supply will require an estimated $2 trillion. 
Data centres’ rapid growth is also putting pressure on limited energy resources in many countries. In Singapore, for example, the government enacted a moratorium to temporarily halt construction in certain regions to carefully review new data centre proposals and ensure alignment with the country’s sustainability goals. The global energy conundrum presents both opportunities and challenges to commercial real estate leaders with a stake in the data centre sector. Generative AI will continue to fuel demand for specialised and redesigned data centres, and developers and operators who can provide sustainable computing power will reap the rewards of the data-intense digital economy.
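The cooling figures quoted in this article can be combined into a rough estimate of what liquid cooling might mean for a facility's total electricity use. The sketch below is a back-of-envelope illustration, not JLL's method. It assumes the quoted reduction of up to 90% applies to cooling energy only and that all other loads stay unchanged; neither assumption is stated in the article, and the function names are hypothetical.

```python
def total_energy_saving(cooling_share, cooling_reduction):
    """Fraction of total facility electricity saved when only cooling energy falls."""
    return cooling_share * cooling_reduction


def new_cooling_share(cooling_share, cooling_reduction):
    """Cooling's share of the reduced total after the saving is applied."""
    remaining_cooling = cooling_share * (1 - cooling_reduction)
    return remaining_cooling / (1 - cooling_share + remaining_cooling)


if __name__ == "__main__":
    share = 0.40      # cooling is roughly 40% of electricity use (from the article)
    reduction = 0.90  # best case quoted; ASSUMED to apply to cooling energy only
    print(f"Total electricity saved: {total_energy_saving(share, reduction):.0%}")
    print(f"Cooling share afterwards: {new_cooling_share(share, reduction):.2%}")
```

Under these assumptions the best case is a saving of about 36% of total electricity, with cooling falling to just over 6% of the remaining consumption.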

Expereo launches an AI-driven and fully internet-based solution
Expereo has launched its Enhanced Internet service, its first AI-driven solution, which continually monitors the 100,000 networks that make up the internet and predicts the best performing route for companies’ network traffic.

Empowering cloud performance
Despite large investments in cloud-first strategies and SaaS applications, many enterprises are still experiencing inconsistent network performance that impacts employee productivity and experience. Enhanced Internet tackles this issue by improving the consistency and quality of application performance, enabling businesses to get the most from their investments in cloud and SaaS strategies. Companies typically spend approximately €500 per employee, per month, on cloud-based applications. Despite this investment, issues in application performance can arise when their network traffic is impacted by latency, jitter and poorly performing routes. This can have serious implications, such as a decline in employee productivity and poor user experience, which can significantly impact the company’s financial performance.

Leveraging AI to improve efficiencies
Enhanced Internet’s AI capabilities mean that customers can experience the agility and accessibility of the public internet, combined with the reliability and consistent performance levels typically expected from a private network or MPLS solution. Expereo’s proprietary AI software does this by combining AI with machine learning to intelligently and proactively route network traffic over the best possible path. For customers, this extends to predicting routes for specific situations: the software learns the most common traffic paths for their applications and is therefore ready to predict the best path before the traffic is sent. What businesses now have at their disposal is a self-healing network that can navigate past latency and packet loss caused by network congestion. 

Complete visibility from a single login
Paired with Expereo’s customer experience platform expereoOne, businesses can have complete visibility of the health of their network, and transparency showing how Enhanced Internet has improved their application performance, all from a single login. In addition to performance data, incident management, order status and invoices at site level, expereoOne displays latency and packet loss statistics, the percentage of times routes have been changed and overall quality of service statistics personalised by the top five cloud destinations for their business. Sander Barens, Chief Product Officer for Expereo, comments, “Unpredictable network performance may not seem, at first, like an urgent threat to your business’ success. That is until you realise the increased demands and pressure it puts on IT teams, the damaging impact it has on business efficiency and hours of productivity downtime. With cloud applications now laying the groundwork for so many modern businesses, connecting to the public internet with the best possible application user experience is the next logical step for global businesses looking to future proof their operations. “At Expereo, we’re already using our latest AI solution, Enhanced Internet, to lift several customers into the internet age of enhanced connectivity, with very positive results and have big plans to elevate even more global enterprises into this new era of AI internet connectivity over the coming year.”

Juniper Networks unveils its AI-Native Networking Platform
Juniper Networks has announced its AI-Native Networking Platform, purpose-built to leverage AI to assure the best end-to-end operator and end-user experiences. Trained on seven years of insights and data science development, Juniper’s platform was designed from the ground up to assure that every connection is reliable, measurable and secure for every device, user, application and asset. Juniper’s platform unifies all campus, branch and data centre networking solutions with a common AI engine and Marvis Virtual Network Assistant (VNA). This enables end-to-end AI for IT Operations (AIOps) to be used for deep insight, automated troubleshooting and seamless end-to-end networking assurance, which elevates IT teams’ focus from maintaining basic network connectivity to delivering exceptional and secure end-to-end experiences for students, staff, patients, guests, customers and employees. The Juniper platform provides the simplest and most assured day zero, one, two and beyond operations, resulting in up to 85% lower operational expenditure than traditional solutions, the elimination of up to 90% of network trouble tickets and 85% of IT onsite visits, and a reduction of up to 50% in network incident resolution times. “AI is the biggest technology inflection point since the internet itself, and its ongoing impact on networking cannot be understated. At Juniper, we have seen first-hand how our game changing AIOps has saved thousands of global enterprises significant time and money while delighting the end user with a superior experience. Our AI-Native Networking Platform represents a bold new direction for Juniper, and for our industry. By extending AIOps from the end user all the way to the application, and across every network domain in between, we are taking a big step toward making network outages, trouble tickets and application downtime things of the past,” says Rami Rahim, Chief Executive Officer, Juniper Networks. 
Within the new AI-Native Networking Platform, Juniper is introducing several new products that advance the experience-first mission, including simpler high performance data centre networks specifically designed for AI training and inference. The company is also expanding its AI Data Centre solution, which consists of a spine-leaf data centre architecture with a foundation of QFX switches and PTX routers operated by Juniper Apstra. With this, Juniper takes much of the complexity out of AI Data Centre networking design, deployment and troubleshooting, allowing customers to do more with fewer IT resources. The solution also delivers unsurpassed flexibility to customers, avoiding vendor lock-in with silicon diversity, multivendor switch management and a commitment to open, standards-based Ethernet fabrics.

VAST Data forms strategic partnership with Genesis Cloud
VAST Data has announced a strategic partnership with Genesis Cloud. Together, VAST and Genesis Cloud aim to make AI and accelerated cloud computing more efficient, scalable and accessible to organisations across the globe. Genesis Cloud helps businesses optimise their AI training and inference pipeline by offering performance and capacity for AI projects at scale while providing enterprise-grade features. The company is using the VAST Data Platform to build a comprehensive set of AI data services in the industry. With VAST, Genesis Cloud will lead a new generation of AI initiatives and Large Language Model (LLM) development by delivering highly automated infrastructure with exceptional performance and hyperscaler efficiency. “To complement Genesis Cloud’s market-leading compute services, we needed a world-class partner at the data layer that could withstand the rigors of data-intensive AI workloads across multiple geographies,” says Dr Stefan Schiefer, CEO at Genesis Cloud. “The VAST Data Platform was the obvious choice, bringing performance, scalability and simplicity paired with rich enterprise features and functionality. Throughout our assessment, we were incredibly impressed not just with VAST’s capabilities and product roadmap, but also their enthusiasm around the opportunity for co-development on future solutions.”
Key benefits for Genesis Cloud with the VAST Data Platform include:
Multitenancy enabling concurrent users across public cloud: VAST allows multiple, disparate organisations to share access to the VAST DataStore, enabling Genesis Cloud to allocate orders for capacity as needed while delivering unparalleled performance.
Enhancing security in cloud environments: By implementing a zero trust security strategy, the VAST Data Platform provides superior security for AI/ML and analytics workloads with Genesis Cloud customers, helping organisations achieve regulatory compliance and maintain the security of their most sensitive data in the cloud. 
Simplified workloads: Managing the data required to train LLMs is a complex data science process. Using the VAST Data Platform’s high performance, single tier and feature rich capabilities, Genesis Cloud is delivering data services that simplify and streamline data set preparation to better facilitate model training. Quick and easy to deploy: The intuitive design of the VAST Data Platform simplifies the complexities traditionally associated with other data management offerings, providing Genesis Cloud with a seamless and efficient deployment experience. Improved GPU utilisation: By providing fast, real-time access to data across public and private clouds, VAST eliminates data loading bottlenecks to ensure high GPU utilisation, better efficiency and ultimately lower costs to the end customer.  Future proof investment with robust enterprise features: The VAST Data Platform consolidates storage, database, and global namespace capabilities that offer unique productisation opportunities for service providers.

AI and data centres: Why AI is so resource hungry
By Ed Ansett, Founder and Chairman, i3 Solutions Group
At the end of 2023, any forecast of how much energy generative AI will require is inexact. Headlines tend towards guesstimates of 5x, 10x and 30x the power needed for AI, and of enough power to run hundreds of thousands of homes or more. Meanwhile, reports in specialist publications talk of power densities rising to 50kW or 100kW per rack.
Why is generative AI so resource hungry? What moves are being made to calculate its potential energy cost and carbon footprint? Or, as one research paper puts it, what is the 'huge computational cost of training these behemoths'? Today, much of this information is not readily available. Analysts have produced their own estimates for specific workload scenarios, but with few disclosed numbers from the cloud hyperscalers at the forefront of model building, there is very little hard data to go on at this time.
Where analysis has been conducted, the carbon cost of AI model building from training to inference has produced some sobering figures. According to a report in the Harvard Business Review, researchers have argued that training a ‘single large language deep learning model’, such as OpenAI’s GPT-4 or Google’s PaLM, is estimated to emit around 300 tons of CO2. Other researchers calculated that training a medium-sized generative AI model using a technique called 'neural architecture search' consumed energy equivalent to 626,000 tons of CO2 emissions.
So, what’s going on to make AI so power hungry? Is it the data set, that is, the volume of data? The number of parameters used? The transformer model? The encoding, decoding and fine tuning? Or the processing time? The answer is, of course, a combination of all of the above.
Data
It is often said that GenAI Large Language Models (LLMs) and Natural Language Processing (NLP) require large amounts of training data. However, measured in terms of traditional data storage, this is not actually the case.
For example, ChatGPT used commoncrawl.com data. Common Crawl says of itself that it is the primary training corpus in every LLM and that it supplied 82% of raw tokens used to train GPT-3. “We make wholesale extraction, transformation and analysis of open web data accessible to researchers. Over 250bn pages spanning 16 years. Three to five billion new pages added each month.”
It is thought that GPT-3 was trained on 45TB of Common Crawl plaintext, filtered down to 570GB of text data [1]. The Common Crawl data set is hosted on AWS free of charge as a contribution to open source AI data.
But storage volumes, that is, the billions of web pages or data tokens that are scraped from the web, Wikipedia and elsewhere, then encoded, decoded and fine-tuned to train ChatGPT and other models, should have no major impact on a data centre. Similarly, the terabytes or petabytes of data needed to train a text-to-speech, text-to-image or text-to-video model should put no extraordinary strain on the power and cooling systems in a data centre built for hosting IT equipment that stores and processes hundreds or thousands of petabytes of data.
An example from the text-to-image field is Large Scale AI Open Network (LAION), a German project whose data sets contain billions of images. One of them, known as LAION-400M, is a 10TB web data set. Another, LAION-5B, has 5.85bn CLIP-filtered text-image pairs.
One reason that training data volumes remain a manageable size is that it has been the fashion amongst the majority of AI model builders to use Pre-Trained Models (PTMs), instead of search models trained from scratch. Two examples of PTMs that are becoming familiar are Bidirectional Encoder Representations from Transformers (BERT) and the Generative Pre-trained Transformer (GPT) series, as in ChatGPT.
Parameters
Another measure of AI training that is of interest to data centre operators is the number of parameters. AI parameters are used by generative AI models during training.
In general, the greater the number of parameters, the more accurately a model can predict the desired outcome. GPT-3 was built on 175bn parameters. But for AI, the number of parameters is already rising rapidly. The first version of Wu Dao, a Chinese LLM, used 1.75 trillion parameters. As well as being an LLM, Wu Dao also provides text-to-image and text-to-video generation. Expect the numbers to continue to grow.
With no hard data available, it is reasonable to surmise that the computational power required to run a model with 1.75 trillion parameters is going to be significant. As we move into more AI video generation, the data volumes and number of parameters used in models will surge.
Transformers
Transformers are a type of neural network architecture developed to solve the problem of sequence transduction or neural machine translation [2]. That means any task that transforms an input sequence to an output sequence. Transformer layers rely on loops so that where the input data moves into one transformer layer the data is looped back to its previous layer and out to the next layer. Such layers improve the predictive output of what comes next. This helps improve speech recognition, text-to-speech transformation, and more.
How much is enough power? What researchers, analysts and the press are saying
A report by S&P Global titled 'POWER OF AI: Wild predictions of power demand from AI put industry on edge' quotes several sources, such as: "Regarding US power demand, it's really hard to quantify how much demand is needed for things like ChatGPT," says David Groarke, Managing Director at Indigo Advisory Group. "In terms of macro numbers, by 2030 AI could account for 3% to 4% of global power demand. Google said right now AI is representing 10% to 15% of their power use or 2.3TWh annually."
S&P Global continues, “Academic research conducted by Alex de Vries, a PhD candidate at the VU Amsterdam School of Business and Economics cites research by semiconductor analysis firm, SemiAnalysis.
In a commentary published on 10 October in the journal, Joule, it is estimated that using generative AI, such as ChatGPT, in each Google search would require more than 500,000 of Nvidia's A100 HGX servers, totaling 4.1 million graphics processing units, or GPUs. At a power demand of 6.5kW per server, that would result in daily electricity consumption of 80GWh and annual consumption of 29.2TWh.”
A calculation of the actual power used to train AI models was offered by RI.SE. It says, “Training a super-large language model like GPT-4, with 1.7 trillion parameters and using 13 trillion tokens or word snippets, is a substantial undertaking. OpenAI has revealed that it cost them $100 million and took 100 days, utilising 25,000 NVIDIA A100 GPUs. Servers with these GPUs use about 6.5kW each, resulting in an estimated 50GWh of energy usage during training.”
This is important because the energy used by AI is rapidly becoming a topic of public discussion. Data centres are already on the map and ecologically focused organisations are taking note. According to 8billiontrees, “There are no published estimates as of yet for the AI industry’s total footprint, and the field of AI is exploding so rapidly that an accurate number would be nearly impossible to obtain. Looking at the carbon emissions from individual AI models is the gold standard at this time. The majority of the energy is dedicated to powering and cooling the hyperscale data centres, where all the computation occurs.”
Conclusion
As we wait for the numbers to emerge for past and existing power use for ML and AI, what is clear is that once models get into production and use, we will be at the exabyte and exaflop scale of computation. For data centre power and cooling, it is then that things become really interesting and more challenging.
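The two energy estimates quoted above, for generative AI in Google search and for GPT-4 training, can be roughly reproduced with simple arithmetic. The sketch below is only a back-of-envelope check, not the method used by the cited sources. It assumes eight GPUs per server (the quotes do not state this figure explicitly) and that every server draws its full 6.5kW around the clock. The function and variable names are illustrative.

```python
# Back-of-envelope check of the energy figures quoted in the article.
# Assumptions (not stated in the source): 8 GPUs per server, and every
# server drawing the quoted 6.5 kW continuously.

HOURS_PER_DAY = 24
GPUS_PER_SERVER = 8        # assumption
KW_PER_SERVER = 6.5        # figure quoted in the article


def energy_gwh(servers: float, kw_per_server: float, days: float) -> float:
    """Energy in GWh used by `servers` running flat out for `days`."""
    kwh = servers * kw_per_server * HOURS_PER_DAY * days
    return kwh / 1e6       # 1 GWh = 1,000,000 kWh


# Scenario 1: generative AI in every Google search (4.1 million GPUs).
search_servers = 4_100_000 / GPUS_PER_SERVER
daily_gwh = energy_gwh(search_servers, KW_PER_SERVER, days=1)
annual_twh = energy_gwh(search_servers, KW_PER_SERVER, days=365) / 1000

# Scenario 2: GPT-4 training (25,000 GPUs for 100 days).
training_servers = 25_000 / GPUS_PER_SERVER
training_gwh = energy_gwh(training_servers, KW_PER_SERVER, days=100)

print(f"Search, daily:  {daily_gwh:.1f} GWh")
print(f"Search, annual: {annual_twh:.1f} TWh")
print(f"GPT-4 training: {training_gwh:.1f} GWh")
```

Under these assumptions the results land close to the quoted 80GWh per day, 29.2TWh per year and roughly 50GWh for training. None of the figures appears to include cooling or other facility overheads, so real data centre consumption would be higher.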


