Artificial Intelligence Could Help Data Centers Run Far More Efficiently
August 23, 2019 | MIT
One concern, however, is that some workload sequences are more difficult to process than others, because they have larger tasks or more complicated structures. Those will always take longer to process (and, therefore, the reward signal will always be lower) than simpler ones. But that doesn’t necessarily mean the system performed poorly: It could make good time on a challenging workload and still be slower than on an easier workload. That variability in difficulty makes it hard for the model to decide which actions are good and which are not.
To address that, the researchers adapted a technique called “baselining.” This technique averages results over scenarios with a large number of variables and uses those averages as a baseline against which to compare future results. During training, they computed a baseline for every input sequence: They let the scheduler train on each workload sequence multiple times, and the system then took the average performance across all of the decisions made for the same input workload. That average is the baseline against which the model can compare its later decisions to determine whether they are good or bad. They refer to this new technique as “input-dependent baselining.”
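The idea can be illustrated with a minimal Python sketch. It is not the researchers' implementation: the function names, the `"easy"` and `"hard"` workload labels, and the choice of negative completion time as the reward are all assumptions made for illustration. It groups rewards by input workload, averages each group to get a per-input baseline, and scores each run by how far it lands above or below the baseline for its own workload.

```python
from collections import defaultdict
from statistics import mean


def per_input_baselines(rollouts):
    """Average reward for each input workload.

    `rollouts` is a list of (input_id, reward) pairs, where the same
    input_id appears several times because the scheduler was run on
    that workload sequence multiple times. (Hypothetical data layout.)
    """
    rewards_by_input = defaultdict(list)
    for input_id, reward in rollouts:
        rewards_by_input[input_id].append(reward)
    return {input_id: mean(rs) for input_id, rs in rewards_by_input.items()}


def input_dependent_advantages(rollouts):
    """Reward minus the baseline of the *same* input workload."""
    baselines = per_input_baselines(rollouts)
    return [reward - baselines[input_id] for input_id, reward in rollouts]


def global_advantages(rollouts):
    """Reward minus one shared baseline, shown only for comparison."""
    shared = mean(reward for _, reward in rollouts)
    return [reward - shared for _, reward in rollouts]


if __name__ == "__main__":
    # Assumed reward: negative completion time, so slower runs score lower.
    rollouts = [
        ("easy", -10.0), ("easy", -12.0), ("easy", -8.0),
        ("hard", -100.0), ("hard", -90.0), ("hard", -110.0),
    ]
    print(input_dependent_advantages(rollouts))
    print(global_advantages(rollouts))
```

With a single shared baseline, every run on the hard workload scores below average and every run on the easy workload scores above it, regardless of how well the scheduler actually did. With per-input baselines, the hard run that finished in 90 time units is correctly scored as better than average for that workload.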
That innovation, the researchers say, is applicable to many different computer systems. “This is a general way to do reinforcement learning in environments where there’s this input process that affects the environment, and you want every training event to consider one sample of that input process,” he says. “Almost all computer systems deal with environments where things are constantly changing.”
Aditya Akella, a professor of computer science at the University of Wisconsin at Madison, whose group has designed several high-performance schedulers, found the MIT system could help further improve their own policies. “Decima can go a step further and find opportunities for [scheduling] optimization that are simply too onerous to realize via manual design/tuning processes,” Akella says. “The schedulers we designed achieved significant improvements over techniques used in production in terms of application performance and cluster efficiency, but there was still a gap with the ideal improvements we could possibly achieve. Decima shows that an RL-based approach can discover [policies] that help bridge the gap further. Decima improved on our techniques by [roughly] 30 percent, which came as a huge surprise.”
Right now, their model is trained on simulations that try to recreate incoming online traffic in real time. Next, the researchers hope to train the model on real-time traffic, which could potentially crash the servers. So, they’re currently developing a “safety net” that will stop their system when it’s about to cause a crash. “We think of it as training wheels,” Alizadeh says. “We want this system to continuously train, but it has certain training wheels that if it goes too far we can ensure it doesn’t fall over.”