Technologies.org

AWS Announces General Availability of Amazon EC2 P4d Instances with EC2 UltraClusters Capability

November 2, 2020 By admin

Next-generation accelerated computing instances powered by NVIDIA A100 Tensor Core GPUs and AWS petabit-scale networking provide up to 3x faster time to train and 60% lower cost than previous generation instances for machine learning training and high-performance computing in the cloud

GE Healthcare, Toyota Research Institute, and Aon among customers using P4d instances

Today, Amazon Web Services, Inc. (AWS), an Amazon.com company (NASDAQ: AMZN), announced the general availability of Amazon Elastic Compute Cloud (Amazon EC2) P4d instances, the next generation of GPU-powered instances delivering 3x faster performance, up to 60% lower cost, and 2.5x more GPU memory for machine learning training and high-performance computing (HPC) workloads when compared to previous generation P3 instances. P4d instances feature eight NVIDIA A100 Tensor Core GPUs and 400 Gbps of network bandwidth (16x more than P3 instances). Using P4d instances with AWS’s Elastic Fabric Adapter (EFA) and NVIDIA GPUDirect RDMA (remote direct memory access), customers are able to create P4d instances with EC2 UltraClusters capability. With EC2 UltraClusters, customers can scale P4d instances to over 4,000 A100 GPUs (2x as many as any other cloud provider) by making use of AWS-designed non-blocking petabit-scale networking infrastructure integrated with Amazon FSx for Lustre high performance storage, offering on-demand access to supercomputing-class performance to accelerate machine learning training and HPC. To get started with P4d instances visit: https://aws.amazon.com/ec2/instance-types/p4
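The headline figures in the paragraph above imply a couple of derived numbers that are easy to check. The sketch below uses only values quoted in this release; the variable names are illustrative, and the P3 bandwidth is inferred from the stated 16x ratio (read as a multiplier) rather than quoted directly.

```python
# Figures quoted in the announcement.
GPUS_PER_P4D = 8            # A100 GPUs per P4d instance
P4D_NETWORK_GBPS = 400      # network bandwidth per P4d instance
NETWORK_RATIO_VS_P3 = 16    # "16x more than P3 instances"
ULTRACLUSTER_GPUS = 4000    # "over 4,000 A100 GPUs"

# Number of P4d instances needed to reach 4,000 GPUs.
instances_needed = ULTRACLUSTER_GPUS // GPUS_PER_P4D

# P3 bandwidth implied by the 16x ratio (an inference, not a quoted figure).
implied_p3_gbps = P4D_NETWORK_GBPS / NETWORK_RATIO_VS_P3

print(instances_needed)   # 500
print(implied_p3_gbps)    # 25.0
```

In other words, an UltraCluster at the quoted scale corresponds to roughly 500 P4d instances.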

Data scientists and engineers are continuing to push the boundaries of machine learning by creating larger and more complex models that provide higher prediction accuracy for a broad range of use cases, including perception model training for autonomous vehicles, natural language processing, image classification, object detection, and predictive analytics. Training these complex models against large volumes of data is a compute-, network-, and storage-intensive task that often takes days or weeks. Customers want not only to cut the time to train their models but also to lower their overall spend on training. Together, long training times and high costs limit how frequently customers can train their models, which translates into a slower pace of development and innovation for machine learning.

The increased performance of P4d instances speeds up the time to train machine learning models by up to 3x (reducing training time from days to hours) and the additional GPU memory helps customers train larger, more complex models. As data becomes more abundant, customers are training models with millions and sometimes billions of parameters, like those used for natural language processing for document summarization and question answering, object detection and classification for autonomous vehicles, image classification for large-scale content moderation, recommendation engines for e-commerce websites, and ranking algorithms for intelligent search engines, all of which require increasing network throughput and GPU memory. P4d instances feature 8 NVIDIA A100 Tensor Core GPUs capable of up to 2.5 petaflops of mixed-precision performance and 320 GB of high bandwidth GPU memory in one EC2 instance. P4d instances are the first in the industry to offer 400 Gbps network bandwidth with Elastic Fabric Adapter (EFA) and NVIDIA GPUDirect RDMA network interfaces to enable direct communication between GPUs across servers for lower latency and higher scaling efficiency, helping to unblock scaling bottlenecks across multi-node distributed workloads. Each P4d instance also offers 96 Intel Xeon Scalable (Cascade Lake) vCPUs, 1.1 TB of system memory, and 8 TB of local NVMe storage to reduce single node training times. By more than doubling the performance of the previous generation of P3 instances, P4d instances can lower the cost to train machine learning models by up to 60%, providing customers greater efficiency over expensive and inflexible on-premises systems. HPC customers will also benefit from P4d’s increased processing performance and GPU memory for demanding workloads like seismic analysis, drug discovery, DNA sequencing, materials science, and financial and insurance risk modeling.
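The link between speedup and cost can be made explicit with a simple model: if training cost is the hourly price times the hours used, then the cost relative to the older instance is the price ratio divided by the speedup. The functions below are a minimal sketch under that assumption. The function names and the price ratio are hypothetical inputs chosen for illustration, not AWS pricing.

```python
def relative_training_cost(speedup: float, price_ratio: float) -> float:
    """Cost of a job on the new instance relative to the old one.

    Assumes cost = hourly price x wall-clock hours, so a job that runs
    `speedup` times faster at `price_ratio` times the hourly price costs
    price_ratio / speedup as much. Both inputs are hypothetical.
    """
    if speedup <= 0 or price_ratio <= 0:
        raise ValueError("speedup and price_ratio must be positive")
    return price_ratio / speedup


def max_price_ratio_for_saving(speedup: float, saving: float) -> float:
    """Largest hourly price ratio that still yields the given fractional saving."""
    if not 0 <= saving < 1:
        raise ValueError("saving must be in [0, 1)")
    return speedup * (1 - saving)


# A 3x speedup at 1.2x the hourly price leaves 40% of the original cost,
# which corresponds to the 60% saving quoted in the release.
print(round(relative_training_cost(3.0, 1.2), 2))        # 0.4
print(round(max_price_ratio_for_saving(3.0, 0.60), 2))   # 1.2
```

Under this model, the quoted "up to" figures are mutually consistent only if the new instance costs at most about 1.2 times as much per hour as the one it replaces; actual savings depend on real prices and on the speedup a given workload achieves.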

P4d instances are also built on the AWS Nitro System, AWS-designed hardware and software that has enabled AWS to deliver an ever-broadening selection of EC2 instances and configurations to customers, while offering performance that is indistinguishable from bare metal, providing fast storage and networking, and ensuring more secure multi-tenancy. P4d instances offload networking functions to dedicated Nitro Cards that accelerate data transfer between multiple P4d instances. Nitro Cards also enable EFA and GPUDirect, which allows for direct cross-server communication between GPUs, facilitating lower latency and better scaling performance across EC2 UltraClusters of P4d instances. These Nitro-powered capabilities make it possible for customers to launch P4d in EC2 UltraClusters with on-demand and scalable access to over 4,000 GPUs for supercomputer-class performance.

“The pace at which our customers have used AWS services to build, train, and deploy machine learning applications has been extraordinary. At the same time, we have heard from those customers that they want an even lower cost way to train their massive machine learning models,” said Dave Brown, Vice President, EC2, AWS. “Now, with EC2 UltraClusters of P4d instances powered by NVIDIA’s latest A100 GPUs and petabit-scale networking, we’re making supercomputing-class performance available to virtually everyone, while reducing the time to train machine learning models by 3x, and lowering the cost to train by up to 60% compared to previous generation instances.”

Customers can run containerized applications on P4d instances with AWS Deep Learning Containers with libraries for Amazon Elastic Kubernetes Service (Amazon EKS) or Amazon Elastic Container Service (Amazon ECS). For a more fully managed experience, customers can use P4d instances via Amazon SageMaker, providing developers and data scientists with the ability to build, train, and deploy machine learning models quickly. HPC customers can leverage AWS Batch and AWS ParallelCluster with P4d instances to help orchestrate jobs and clusters efficiently. P4d instances support all major machine learning frameworks, including TensorFlow, PyTorch, and Apache MXNet, giving customers the flexibility to choose the framework that works best for their applications. P4d instances are available in US East (N. Virginia) and US West (Oregon), with availability planned for additional regions soon. P4d instances can be purchased as On-Demand, with Savings Plans, with Reserved Instances, or as Spot Instances.
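For readers who want to try an instance directly, the sketch below assembles request parameters for the EC2 `RunInstances` API with an EFA network interface and a placement group. It only builds a dictionary and makes no AWS calls. The helper name, AMI, subnet, security group, and placement group values are hypothetical placeholders, the `p4d.24xlarge` size name does not appear in this release, and the configuration AWS recommends for UltraClusters is not described here.

```python
from typing import Any, Dict


def p4d_launch_params(ami_id: str, subnet_id: str, security_group_id: str,
                      placement_group: str, count: int = 1) -> Dict[str, Any]:
    """Build keyword arguments for an EC2 RunInstances request (hypothetical helper)."""
    if count < 1:
        raise ValueError("count must be at least 1")
    return {
        "ImageId": ami_id,
        "InstanceType": "p4d.24xlarge",   # size name assumed, not from the release
        "MinCount": count,
        "MaxCount": count,
        # Placement group that keeps instances close together on the network.
        "Placement": {"GroupName": placement_group},
        # Primary network interface requested as an Elastic Fabric Adapter.
        "NetworkInterfaces": [{
            "DeviceIndex": 0,
            "InterfaceType": "efa",
            "SubnetId": subnet_id,
            "Groups": [security_group_id],
        }],
    }


# Placeholder identifiers; replace them with real resources before use.
params = p4d_launch_params("ami-EXAMPLE", "subnet-EXAMPLE", "sg-EXAMPLE",
                           "example-cluster-group", count=2)
print(params["InstanceType"], params["MaxCount"])
```

With the boto3 library installed and credentials configured, a dictionary like this could be passed as `boto3.client("ec2", region_name="us-east-1").run_instances(**params)`; that call launches billable instances, so it is left out of the sketch.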

GE Healthcare is the $16.7 billion healthcare business of GE. As a leading global medical technology and digital solutions innovator, GE Healthcare enables clinicians to make faster, more informed decisions through intelligent devices, data analytics, applications and services, supported by its Edison intelligence platform. “At GE Healthcare, we provide clinicians with tools that help them aggregate data, apply AI and analytics to that data and uncover insights that improve patient outcomes, drive efficiency and eliminate errors,” said Karley Yoder, VP & GM, Artificial Intelligence, at GE Healthcare. “Our medical imaging devices generate massive amounts of data that need to be processed by our data scientists. With previous GPU clusters, it would take days to train complex AI models, such as Progressive GANs, for simulations and view the results. Using the new P4d instances reduced processing time from days to hours. We saw two- to three-times greater speed on training models with various image sizes, while achieving better performance with increased batch size and higher productivity with a faster model development cycle.”

Toyota Research Institute (TRI), founded in 2015, is working to develop automated driving, robotics, and other human amplification technology for Toyota. “At TRI, we’re working to build a future where everyone has the freedom to move,” said Mike Garrison, Technical Lead, Infrastructure Engineering at TRI. “The previous generation P3 instances helped us reduce our time to train machine learning models from days to hours and we are looking forward to utilizing P4d instances, as the additional GPU memory and more efficient float formats will allow our machine learning team to train with more complex models at an even faster speed.”

Aon is a leading global professional services firm providing a broad range of risk, retirement and health solutions. Aon PathWise is a GPU-based and scalable HPC risk management solution that insurers and re-insurers, banks, and pension funds can use to address today’s key challenges such as hedge strategy testing, regulatory and economic forecasting, and budgeting. “Aon PathWise allows (re)insurers and pension funds to access next generation technology to rapidly solve today’s key insurance challenges such as hedge strategy testing, regulatory and economic forecasting, and budgeting,” said Peter Phillips, President and CEO, PathWise. “Through the use of AWS P4d instances with 2.5 petaflops of mixed-precision performance, we are able to deliver a two-fold reduction in cost to our customers without loss of performance, and can deliver a 2.5x improvement in speed for the most demanding calculations. Speed matters and we continue to delight our customers thanks to the new instances from AWS.”

Composed of radiology and AI experts, Rad AI builds products that maximize radiologist productivity, ultimately making healthcare more widely accessible and improving patient outcomes. “At Rad AI, our mission is to increase access to and quality of healthcare, for everyone. With a focus on medical imaging workflow, Rad AI saves radiologists time, reduces burnout, and enhances accuracy,” said Doktor Gurson, Co-founder of Rad AI. “We use AI to automate radiology workflows and help streamline radiology reporting. With the new EC2 P4d instances, we’ve seen faster inference and the ability to train models 2.4x faster, with higher accuracy than on previous generation P3 instances. This allows faster, more accurate diagnosis, and greater access to high quality radiology services provided by our customers across the US.”

OmniSci is a pioneer in accelerated analytics. The OmniSci platform is used in business and government to find insights in data beyond the limits of mainstream analytics tools. “At OmniSci, we’re working to build a future where data science and analytics converge to break down and fuse data silos. Customers are leveraging their massive amounts of data that may include location and time to build a full picture of not only what is happening, but when and where through granular visualization of spatial temporal data. Our technology enables seeing both the forest and the trees,” said Ray Falcione, VP of US Public Sector, at OmniSci. “Through the use of P4d instances, we were able to reduce the cost to deploy our platform significantly compared to previous generation GPU instances, thus enabling us to cost-effectively scale massive data sets. The networking improvements on A100 have increased our efficiencies in how we scale to billions of rows of data and enabled our customers to glean insights even faster.”

Zenotech Ltd is redefining engineering online through the use of HPC clouds, delivering on-demand licensing models together with extreme performance benefits by leveraging GPUs. “At Zenotech we are developing the tools to enable designers to create more efficient and environmentally friendly products. We work across industries and our tools provide greater product performance insight through the use of large scale simulation,” said Jamil Appa, Director and Co-Founder, Zenotech. “The use of P4d instances enables us to reduce our simulation runtime by 65% compared to the previous generation of GPUs. This speedup cuts our time to solve significantly, allowing our customers to get designs to market faster or to do higher fidelity simulations than were previously possible.”

About Amazon Web Services

For 14 years, Amazon Web Services has been the world’s most comprehensive and broadly adopted cloud platform. AWS offers over 175 fully featured services for compute, storage, databases, networking, analytics, robotics, machine learning and artificial intelligence (AI), Internet of Things (IoT), mobile, security, hybrid, virtual and augmented reality (VR and AR), media, and application development, deployment, and management from 77 Availability Zones (AZs) within 24 geographic regions, with announced plans for 12 more Availability Zones and four more AWS Regions in Indonesia, Japan, Spain, and Switzerland. Millions of customers—including the fastest-growing startups, largest enterprises, and leading government agencies—trust AWS to power their infrastructure, become more agile, and lower costs. To learn more about AWS, visit aws.amazon.com.

About Amazon

Amazon is guided by four principles: customer obsession rather than competitor focus, passion for invention, commitment to operational excellence, and long-term thinking. Customer reviews, 1-Click shopping, personalized recommendations, Prime, Fulfillment by Amazon, AWS, Kindle Direct Publishing, Kindle, Fire tablets, Fire TV, Amazon Echo, and Alexa are some of the products and services pioneered by Amazon. For more information, visit amazon.com/about and follow @AmazonNews.

Filed Under: Tech

Copyright © 2026 Technologies.org
