
Technologies.org


Beyond the CPU or GPU: Why Enterprise-Scale Artificial Intelligence Requires a More Holistic Approach

May 23, 2018 By admin

Industry Assembles at Intel AI DevCon; Updates Provided on Intel AI Portfolio and Intel Nervana Neural Network Processor

The following is an opinion editorial provided by Naveen Rao, vice president and general manager of the Artificial Intelligence Products Group at Intel Corporation.

This is an exciting week as we gather the brightest minds working with artificial intelligence (AI) at Intel AI DevCon, our inaugural AI developer conference. We recognize that achieving the full promise of AI isn’t something we at Intel can do alone. Rather, we need to address it together as an industry, inclusive of the developer community, academia, the software ecosystem and more.

So as I take the stage today, I am excited to do it with so many others throughout the industry. This includes developers joining us for demonstrations, research and hands-on training. We’re also joined by supporters including Google*, AWS*, Microsoft*, Novartis* and C3 IoT*. It is this breadth of collaboration that will help us collectively empower the community to deliver the hardware and software needed to innovate faster and stay nimble on the many paths to AI.

Indeed, when I think about what will help us accelerate the transition to the AI-driven future of computing, the answer is ensuring that we deliver solutions that are both comprehensive and enterprise-scale. This means solutions that offer the broadest range of compute, with multiple architectures spanning milliwatts to kilowatts.

Enterprise-scale AI also means embracing and extending the tools, open frameworks and infrastructure the industry has already invested in, so that researchers can work across the full variety of AI workloads. For example, AI developers are increasingly interested in programming directly to open-source frameworks rather than to a product-specific software platform, which again allows development to proceed more quickly and efficiently.

Today, our announcements will span all of these areas, along with several new partnerships that will help developers and our customers reap the benefits of AI even faster.

Expanding the Intel AI Portfolio to Address the Diversity of AI Workloads

We’ve learned from a recent Intel survey that over 50 percent of our U.S. enterprise customers are turning to existing cloud-based solutions powered by Intel® Xeon® processors for their initial AI needs. This affirms Intel’s approach of offering a broad range of enterprise-scale products – including Intel Xeon processors, Intel® Nervana™ and Intel® Movidius™ technologies, and Intel® FPGAs – to address the unique requirements of AI workloads.

One of the important updates we’re discussing today is a set of optimizations to Intel Xeon Scalable processors. These optimizations deliver significant performance improvements on both training and inference compared to previous generations. That benefits the many companies that want to use the infrastructure they already own, and capture the related TCO benefits, as they take their first steps toward AI.

We are also providing several updates on our newest family of Intel® Nervana™ Neural Network Processors (NNPs). The Intel Nervana NNP has an explicit design goal to achieve high compute utilization and support true model parallelism with multichip interconnects. Our industry talks a lot about maximum theoretical performance or TOP/s numbers; however, the reality is that much of that compute is meaningless unless the architecture has a memory subsystem capable of supporting high utilization of those compute elements. Additionally, much of the industry’s published performance data uses large square matrices that aren’t generally found in real-world neural networks.
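The gap between peak and usable compute can be illustrated with the roofline model, a standard back-of-the-envelope bound that is not taken from Intel's materials. The Python sketch below assumes a hypothetical accelerator (the peak and bandwidth constants are invented for illustration, not NNP specifications) and assumes each matrix crosses the memory interface exactly once at 2 bytes per element.

```python
def gemm_arithmetic_intensity(m, k, n, bytes_per_element=2):
    """Operations per byte moved for C = A(m, k) @ B(k, n).

    Assumes A, B and C each cross the memory interface exactly once.
    """
    ops = 2 * m * k * n  # one multiply and one add per inner-product term
    bytes_moved = bytes_per_element * (m * k + k * n + m * n)
    return ops / bytes_moved


def attainable_tops(peak_tops, mem_bandwidth_tb_per_s, intensity):
    """Roofline bound in TOP/s: the lower of peak compute and bandwidth * intensity."""
    return min(peak_tops, mem_bandwidth_tb_per_s * intensity)


# Hypothetical accelerator parameters, chosen only for illustration.
PEAK_TOPS = 40.0
MEM_BANDWIDTH_TB_PER_S = 1.0

shapes = {
    "large square": (8192, 8192, 8192),
    "small batch": (8, 2048, 1536),
}

for name, (m, k, n) in shapes.items():
    intensity = gemm_arithmetic_intensity(m, k, n)
    bound = attainable_tops(PEAK_TOPS, MEM_BANDWIDTH_TB_PER_S, intensity)
    print(f"{name:>12}: {intensity:8.1f} ops/byte, "
          f"bound {bound:5.1f} TOP/s ({bound / PEAK_TOPS:.0%} of peak)")
```

Under these assumptions the large square problem is compute-bound and can reach the nominal peak, while the small-batch shape is limited by memory bandwidth to a fraction of it. This is the sense in which benchmarks on large square matrices can overstate what a real network would see.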

At Intel, we have focused on creating a balanced architecture for neural networks that also includes high chip-to-chip bandwidth at low latency. Initial performance benchmarks on our NNP family show strong competitive results in both utilization and interconnect. Specifics include:

  • General Matrix-Matrix Multiplication (GEMM) operations using A(1536, 2048) and B(2048, 1536) matrix sizes have achieved more than 96.4 percent compute utilization on a single chip [1]. This represents around 38 TOP/s of actual (not theoretical) performance on a single chip [1].
  • Multichip distributed GEMM operations that support model-parallel training are realizing nearly linear scaling and 96.2 percent scaling efficiency [2] for A(6144, 2048) and B(2048, 1536) matrix sizes – enabling multiple NNPs to be connected together and freeing us from the memory constraints of other architectures.
  • We are measuring 89.4 percent unidirectional chip-to-chip efficiency [3] relative to theoretical bandwidth at less than 790ns (nanoseconds) of latency, and are excited to apply this to the 2.4Tb/s (terabits per second) of high-bandwidth, low-latency interconnects.

All of this is happening within a single-chip total power envelope of under 210 watts. And this is just the prototype of our Intel Nervana NNP (Lake Crest), from which we are gathering feedback from our early partners.
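The benchmark figures quoted above can be related to one another with a little arithmetic. The Python sketch below uses only numbers from the text, and it rests on two assumptions that the text does not spell out: that utilization means achieved throughput divided by peak throughput, and that scaling efficiency means measured speedup divided by the number of chips. Because the text says "around 38 TOP/s" and "more than 96.4 percent", the derived values are approximate.

```python
def gemm_ops(m, k, n):
    """Operation count for C = A(m, k) @ B(k, n): one multiply and one add per term."""
    return 2 * m * k * n


# Figures quoted in the text.
ACHIEVED_TOPS = 38.0        # "around 38 TOP/s" on a single chip
UTILIZATION = 0.964         # "more than 96.4 percent compute utilization"
SCALING_EFFICIENCY = 0.962  # "96.2 percent scaling efficiency"
NUM_CHIPS = 2               # footnote 2: two chips vs. a single chip

# Assumption: utilization = achieved / peak, so peak = achieved / utilization.
implied_peak_tops = ACHIEVED_TOPS / UTILIZATION

# Work in the single-chip benchmark, A(1536, 2048) @ B(2048, 1536).
single_chip_ops = gemm_ops(1536, 2048, 1536)
single_chip_time_us = single_chip_ops / (ACHIEVED_TOPS * 1e12) * 1e6

# Assumption: scaling efficiency = speedup / number of chips.
two_chip_speedup = NUM_CHIPS * SCALING_EFFICIENCY

print(f"implied peak:     {implied_peak_tops:.1f} TOP/s")
print(f"single-chip GEMM: {single_chip_ops / 1e9:.2f} GOP "
      f"in about {single_chip_time_us:.0f} microseconds")
print(f"two-chip speedup: {two_chip_speedup:.3f}x")
```

Read this way, the quoted utilization implies a peak a little above 39 TOP/s, and the two-chip result corresponds to a speedup of roughly 1.9 times over one chip.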

We are building toward the first commercial NNP product offering, the Intel Nervana NNP-L1000 (Spring Crest), in 2019. We expect the Intel Nervana NNP-L1000 to achieve 3-4 times the training performance of our first-generation Lake Crest product. We will also support bfloat16, a numerical format being adopted industrywide for neural networks, in the Intel Nervana NNP-L1000. Over time, Intel will be extending bfloat16 support across our AI product lines, including Intel Xeon processors and Intel FPGAs. This is part of a cohesive and comprehensive strategy to bring leading AI training capabilities to our silicon portfolio.
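For readers unfamiliar with bfloat16: the format keeps the sign bit and all 8 exponent bits of an IEEE 754 single-precision float but only the top 7 mantissa bits, so it is the upper 16 bits of a float32. The Python sketch below converts by simple truncation, which is the easiest variant to show; hardware commonly rounds to nearest instead, and nothing here describes how the NNP-L1000 performs the conversion. The function names are illustrative.

```python
import struct


def float32_to_bfloat16_bits(x: float) -> int:
    """Return the 16-bit bfloat16 pattern for x by truncating its float32 encoding."""
    (bits32,) = struct.unpack("<I", struct.pack("<f", x))
    return bits32 >> 16  # keep sign, 8 exponent bits, top 7 mantissa bits


def bfloat16_bits_to_float(bits16: int) -> float:
    """Expand a bfloat16 bit pattern back to a Python float via float32."""
    (value,) = struct.unpack("<f", struct.pack("<I", (bits16 & 0xFFFF) << 16))
    return value


for x in (1.0, 3.14159, 1e-20, 262016.0):
    bits = float32_to_bfloat16_bits(x)
    print(f"{x!r:>10} -> 0x{bits:04x} -> {bfloat16_bits_to_float(bits)!r}")
```

The round trip shows the trade-off: values such as 1e-20 remain representable because the exponent range matches float32, while 3.14159 comes back as 3.140625 because only about two to three decimal digits of precision survive.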

AI for the Real World

The breadth of our portfolio has made it easy for organizations of all sizes to start their AI journey with Intel. For example, Intel is collaborating with Novartis on the use of deep neural networks to accelerate high content screening – a key element of early drug discovery. The collaboration team cut time to train image analysis models from 11 hours to 31 minutes – an improvement of greater than 20 times [4].

To accelerate customer success with AI and IoT application development, Intel and C3 IoT announced a collaboration featuring an optimized AI software and hardware solution: a C3 IoT AI Appliance powered by Intel AI.

Additionally, we are working to integrate deep learning frameworks including TensorFlow*, MXNet*, PaddlePaddle*, CNTK* and ONNX* onto nGraph, a framework-neutral deep neural network (DNN) model compiler. And we’ve announced that our Intel AI Lab is open-sourcing the Natural Language Processing Library for JavaScript* that helps researchers begin their own work on NLP algorithms.

The future of computing hinges on our collective ability to deliver the solutions – the enterprise-scale solutions – that organizations can use to harness the full power of AI. We’re eager to engage with the community and our customers alike to develop and deploy this transformational technology, and we look forward to an incredible experience here at AI DevCon.

Tests document performance of components on a particular test, in specific systems. Differences in hardware, software, or configuration will affect actual performance. Consult other sources of information to evaluate performance as you consider your purchase. For more complete information about performance and benchmark results, visit www.intel.com/benchmarks.

Source: Intel measurements on limited release Software Development Vehicle (SDV)

[1] General Matrix-Matrix Multiplication (GEMM) operations; A(1536, 2048), B(2048, 1536) matrix sizes

[2] Two chip vs. single chip GEMM operation performance; A(6144, 2048), B(2048, 1536) matrix sizes

[3] Full chip MRB-CHIP MRB data movement using send/recv, Tensor size = (1, 32), average across 50K iterations

[4] 20X claim based on 21.7X speed up achieved by scaling from single node system to 8-socket cluster.

8-socket cluster node configuration: CPU: Intel® Xeon® 6148 Processor @ 2.4GHz; Cores: 40; Sockets: 2; Hyper-threading: Enabled; Memory/node: 192GB, 2666MHz; NIC: Intel® Omni-Path Host Fabric Interface (Intel® OP HFI); TensorFlow: v1.7.0; Horovod: 0.12.1; OpenMPI: 3.0.0; Cluster: ToR Switch: Intel® Omni-Path Switch

Single node configuration: CPU: Intel® Xeon® Phi Processor 7290F; 192GB DDR4 RAM; 1x 1.6TB Intel® SSD DC S3610 Series SC2BX016T4; 1x 480GB Intel® SSD DC S3520 Series SC2BB480G7; Intel® MKL 2017/DAAL/Intel Caffe

