Announcing Amazon EC2 F1 Instances with Custom FPGAs

Amazon EC2 F1 is a new compute instance with programmable hardware for application acceleration. With F1, you can directly access custom FPGA hardware on the instance in a few clicks. Learning Objectives: • Learn about the capabilities, features, and benefits of the new F1 instances • Develop your FPGA using the F1 Hardware Developer Kit and FPGA Developer AMI • Deploy your FPGA acceleration code using F1 instances • Use F1 instances for hardware acceleration in your applications • Learn how to offer pre-packaged Amazon FPGA Machine Images (AFIs) to your customers through the AWS Marketplace
  • 1. Announcing Amazon EC2 F1 Instances with Custom FPGAs: Hardware-Accelerated Computing on AWS. David Pellerin, Business Development Principal. December 15, 2016. © 2016, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
  • 2. Agenda 1. Accelerated Computing Concepts 2. Introducing F1 FPGA Instances 3. Examples of FPGA Use-Cases 4. FPGA Development Process
  • 3. Accelerated Computing on EC2
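  • 4. EC2 Compute Instance Types. (Figure: timeline, 2010 to 2016, of EC2 instance families grouped as general purpose, compute optimized, storage and I/O optimized, memory optimized, and GPU and FPGA accelerated. Families shown: M3, M4, T2, C3, C4, C5, CC2, I2, I3, D2, HS1, R3, R4, X1, CG1, G2, P2, and F1, with some marked as announced and F1 marked as a 2016 preview.)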
  • 5. GPU and FPGA for Accelerated Computing. P2: GPU-accelerated computing (NVIDIA Tesla GPU card). Enables a high degree of parallelism: each GPU has thousands of cores. Consistent, well-documented set of APIs (CUDA, OpenACC, OpenCL). Supported by a wide variety of ISVs and open source frameworks. F1: FPGA-accelerated computing (Xilinx UltraScale+ FPGA). Massively parallel: each FPGA includes millions of parallel system logic cells. Flexible: no fixed instruction set; can implement wide or narrow datapaths. Programmable using available, cloud-based FPGA development tools.
  • 6. Accelerated Computing Concepts: more parallelism for higher throughput. CPU: high speed, lower efficiency. GPU/FPGA: high throughput, higher efficiency. GPUs and FPGAs can provide massive parallelism and higher efficiency than CPUs for certain categories of applications.
  • 7. Parallel Processing in GPUs and FPGAs. A GPU is effective at processing the same set of operations in parallel: single instruction, multiple data (SIMD). A GPU has a well-defined instruction set and fixed word sizes, for example single-, double-, or half-precision integer and floating point values. An FPGA is effective at processing the same or different operations in parallel: multiple instructions, multiple data (MIMD). An FPGA does not have a predefined instruction set or a fixed data width. (Figure: block diagrams of one CPU core with control logic, ALUs, cache, and DRAM; a GPU with DRAM, annotated "Each GPU in P2 has 2880 of these cores"; and an FPGA with Block RAM and DRAM, annotated "Each FPGA in F1 has more than 2M of these cells".)
  • 8. How FPGA Acceleration Works. The FPGA handles compute-intensive, deeply pipelined, hardware-accelerated operations; the CPU handles the rest of the application. (Figure: an application split between the CPU and the FPGA, with a fragment of Verilog shown for the FPGA side: module filter1 (clock, rst, strm_in, strm_out) ... integer i,j; //index for loops ... always@(posedge clock) ... for (i=0; i<NUMUNITS; i=i+1) ... tmp_kernel[j] = k[i*OFFSETX];)
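The slide does not spell out what its filter1 module computes, so the following is only a rough software model of the general idea, assuming a generic streaming FIR filter. The names fir_stream, samples, and kernel are illustrative and do not come from the deck. Each loop iteration stands in for one clock tick: a new sample shifts in, and every tap multiplies its sample "at once".

```python
from collections import deque
from typing import Iterable, Iterator, Sequence


def fir_stream(samples: Iterable[int], kernel: Sequence[int]) -> Iterator[int]:
    """Yield one filtered output per input sample, like a clocked pipeline.

    Hypothetical model: a generic FIR filter, not the slide's actual design.
    """
    # Shift register holding the most recent len(kernel) samples, newest first.
    window = deque([0] * len(kernel), maxlen=len(kernel))
    for sample in samples:
        # One "clock tick": shift the new sample in, dropping the oldest one.
        window.appendleft(sample)
        # In hardware each multiply would be a separate multiplier working in
        # the same cycle; Python simply evaluates them one after another.
        yield sum(tap * value for tap, value in zip(kernel, window))


if __name__ == "__main__":
    print(list(fir_stream([1, 2, 3, 4], kernel=[1, 1, 1])))  # [1, 3, 6, 9]
```

On an FPGA the number of taps sets how much logic is used rather than how long each sample takes, which is the sense in which the slide calls these operations deeply pipelined.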
  • 9. Hardware-Accelerated Computing: building parallel systems for parallel problems. (Figure: several data streams flowing through a grid of parallel "Process" blocks.)
  • 10. Parallel Processing in FPGAs. An FPGA is effective at processing data of many types in parallel, for example creating a complex pipeline of parallel, multistage operations on a video stream, or performing massive numbers of dependent or independent calculations for a complex financial model. An FPGA does not have an instruction set! Data can be any bit-width (9-bit integer? No problem!). Complex control logic (such as a state machine) is easy to implement in an FPGA. Each FPGA in F1 has more than 2M logic cells.
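To make the "9-bit integer" remark concrete, here is a minimal sketch of how arithmetic behaves at an arbitrary width. It assumes two's-complement signed values, and the helper names wrap_signed and add9 are invented for this illustration. A CPU would emulate this with masking, as below, while an FPGA datapath can simply be built 9 bits wide.

```python
def wrap_signed(value: int, bits: int) -> int:
    """Interpret value as a two's-complement integer that is `bits` wide."""
    mask = (1 << bits) - 1
    value &= mask                      # keep only the low `bits` bits
    sign_bit = 1 << (bits - 1)
    return value - (1 << bits) if value & sign_bit else value


def add9(a: int, b: int) -> int:
    """Add two values on a hypothetical 9-bit signed datapath (-256..255)."""
    return wrap_signed(a + b, 9)


if __name__ == "__main__":
    print(add9(100, 100))   # 200, fits in 9 bits
    print(add9(255, 1))     # -256, overflow wraps around
```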
  • 11. Introducing F1 FPGA Instances
  • 12. FPGA Acceleration in the AWS Cloud: Goals. Make FPGA acceleration available to a larger community of developers, and to millions of potential end-customers. Provide dedicated and large amounts of FPGA logic in a single EC2 instance, using multiple FPGAs. Simplify the development process by providing cloud-based FPGA development tools. Allow developers to focus on algorithm design, by abstracting FPGA I/O using well-defined interfaces. Provide access to a growing ecosystem of FPGA programming tools and applications. Provide a Marketplace for FPGA applications, providing more choice and easy access for all AWS customers.
  • 13. F1 FPGA Instance Types on AWS. New EC2 FPGA instance type for accelerated computing. Up to 8 Xilinx UltraScale+ 16nm VU9P FPGA devices in a single instance. The f1.16xlarge size provides 8 FPGAs, each with over 2 million customer-accessible FPGA programmable logic cells and over 5000 programmable DSP blocks. Each of the 8 FPGAs has 4 DDR-4 interfaces, with each interface accessing a 16GiB, 72-bit wide, ECC-protected memory.
      Instance Size   FPGAs   DDR-4 (GiB)   FPGA Link   FPGA Direct   vCPUs   Instance Memory (GiB)   NVMe Instance Storage (GB)   Network Bandwidth*
      f1.2xlarge      1       4 x 16        -           -             8       122                     1 x 480                      10 Gbps Peak
      f1.16xlarge     8       32 x 16       Y           Y             64      976                     4 x 960                      30 Gbps
      *In a placement group
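The DDR-4 column gives interface counts rather than totals, so a short calculation may help. The sketch below only restates numbers from the slide (16 GiB per DDR-4 interface, 4 or 32 interfaces, 1 or 8 FPGAs). The class and field names are made up for illustration.

```python
from dataclasses import dataclass

GIB_PER_DDR4_INTERFACE = 16  # from the slide: each interface accesses 16 GiB


@dataclass(frozen=True)
class F1Size:
    name: str
    fpgas: int
    ddr4_interfaces: int       # total across all FPGAs in the instance
    vcpus: int
    instance_memory_gib: int

    @property
    def fpga_ddr4_gib(self) -> int:
        """Total FPGA-attached DDR-4 memory for the instance, in GiB."""
        return self.ddr4_interfaces * GIB_PER_DDR4_INTERFACE

    @property
    def ddr4_gib_per_fpga(self) -> int:
        """FPGA-attached DDR-4 memory available to each FPGA, in GiB."""
        return self.fpga_ddr4_gib // self.fpgas


SIZES = [
    F1Size("f1.2xlarge", fpgas=1, ddr4_interfaces=4, vcpus=8, instance_memory_gib=122),
    F1Size("f1.16xlarge", fpgas=8, ddr4_interfaces=32, vcpus=64, instance_memory_gib=976),
]

if __name__ == "__main__":
    for size in SIZES:
        print(f"{size.name}: {size.fpga_ddr4_gib} GiB FPGA DDR-4 "
              f"({size.ddr4_gib_per_fpga} GiB per FPGA)")
```

Both sizes work out to 64 GiB of DDR-4 per FPGA, for 64 GiB total on f1.2xlarge and 512 GiB total on f1.16xlarge.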
  • 14. What’s Inside the F1 FPGA? System Logic Block: each FPGA in F1 provides over 2M of these logic blocks. DSP (Math) Block: each FPGA in F1 has more than 5000 of these blocks. I/O Blocks: used to communicate externally, for example to DDR-4, PCIe, or ring. Block RAM: each FPGA in F1 has over 60Mb of internal Block RAM, and over 230Mb of embedded UltraRAM. (Figure: FPGA fabric with Block RAM columns and I/O blocks connecting to four DDR-4 interfaces, PCIe, and FPGA Link.)
  • 15. Abstracting FPGA I/O: the AWS FPGA Shell. FPGA I/O is provided using pre-configured, pre-tested, and secure I/O components, allowing FPGA developers to focus on their differentiating value. The FPGA Shell allows for faster coding of core acceleration functions by removing the need to develop I/O-related FPGA hardware. (Figure: the shell surrounding the custom logic and Block RAM, connecting to four DDR-4 interfaces, FPGA Link, and PCIe.)
  • 16. FPGA Acceleration Using F1. Launch an instance and load an AFI. An F1 instance can have any number of AFIs. An AFI can be loaded into the FPGA in less than 1 second. (Figure: an EC2 F1 instance launched from an Amazon Machine Image (AMI), running a CPU application that connects over PCIe to FPGAs loaded from an Amazon FPGA Image (AFI); the FPGAs are joined by FPGA Link and reach DDR-4 attached memory through DDR controllers.)
  • 17. Example F1 Use-Cases
  • 18. F1 for Genomics Processing. FPGA speeds analysis of whole human genomes from hours to minutes. Unprecedented low cost for compute and compressed storage. Highly efficient: algorithms implemented in hardware; gate-level circuit design; no instruction set overhead. Massively parallel: massively parallel circuits; multiple compute engines; rapid FPGA reconfigurability.
  • 19. F1 for Financial Computing Modeling Counterparty Risk (CVA) and Regulatory Capital Requirements
  • 20. F1 for Video Processing Next Generation Video Compression for Broadcast Quality 4K content Successfully ported to F1 in just 3 weeks
  • 21. F1 for Accelerated Analytics Heterogeneous Compute Acceleration for Faster Data Discovery
  • 22. FPGA Development Process
  • 23. Developing Applications for F1: development steps. (1) Launch the AWS-provided FPGA Developer AMI, which includes all needed FPGA design and programming software, as well as the AWS FPGA Hardware Development Kit (HDK). (2) Use Xilinx Vivado or SDAccel software and a hardware description language (Verilog, VHDL, or OpenCL) with the HDK to describe and simulate your custom FPGA logic. (3) After successful simulation, use Vivado or SDAccel to synthesize and place/route the FPGA logic to create an FPGA Design Check Point (DCP), encrypt, and generate an Amazon FPGA Image (AFI). (4) Launch an F1 instance and load the AFI to the FPGA, using AFI management tools provided by AWS.
  • 24. Developing Applications for F1. FPGA logic design using Xilinx Vivado on a C4 or M4 instance. FPGA place-and-route using Xilinx Vivado on a C4 or M4 instance. Generate an Amazon FPGA Image (AFI). Securely deploy the AFI on one or more F1 instances.
  • 25. Developing Applications for F1. Choose and launch the AWS-provided FPGA Developer AMI, which includes all needed FPGA design and programming software, as well as the AWS FPGA Hardware Development Kit (HDK).
  • 26. Developing Applications for F1
  • 27. Developing Applications for F1. Use Xilinx Vivado or SDAccel software and a hardware description language (Verilog, VHDL, or OpenCL) with the HDK to describe and simulate your custom FPGA logic. After successful simulation, use scripts provided with the HDK to encrypt, synthesize, and place/route the FPGA logic to create a final FPGA Design Check Point (DCP) and generate a secure, encrypted Amazon FPGA Image (AFI).
  • 28. Developing Applications for F1. Generate an Amazon FPGA Image (AFI), then deploy the AFI on one or more F1 instances: launch an F1 instance and download the AFI to the FPGA, using AFI management tools provided by AWS.
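The deck does not name the AFI management tools. As an assumption, the sketch below uses the fpga-load-local-image command and its -S (slot) and -I (image ID) options from the FPGA management tools that AWS later published in its aws-fpga SDK; check the current SDK documentation before relying on them. The helper name build_load_command and the image ID are placeholders, and the function only assembles the command line without running anything.

```python
from typing import List

MAX_FPGA_SLOTS = 8  # from the slides: up to 8 FPGAs in a single instance


def build_load_command(slot: int, agfi_id: str) -> List[str]:
    """Return the argv for loading an AFI into one FPGA slot.

    Assumption: command name and flags follow the aws-fpga SDK's
    fpga-load-local-image tool; this helper itself is hypothetical.
    """
    if not 0 <= slot < MAX_FPGA_SLOTS:
        raise ValueError(f"slot must be in 0..{MAX_FPGA_SLOTS - 1}, got {slot}")
    if not agfi_id.startswith("agfi-"):
        raise ValueError("expected a global AFI ID starting with 'agfi-'")
    return ["fpga-load-local-image", "-S", str(slot), "-I", agfi_id]


if __name__ == "__main__":
    # Placeholder ID for illustration only; substitute your own AFI's ID.
    print(" ".join(build_load_command(0, "agfi-0123456789abcdef0")))
```

On a real F1 instance the resulting command would typically be run with root privileges, once per FPGA slot that should receive the image.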
  • 29. Delivering FPGA Partner Solutions on AWS via AWS Marketplace. Amazon EC2 FPGA deployment via Marketplace. The AFI is secured, encrypted, and dynamically loaded into the FPGA; it can’t be copied or downloaded. (Figure: an Amazon Machine Image (AMI) and an Amazon FPGA Image (AFI) delivered to customers through AWS Marketplace.)
  • 30. Delivering FPGA Partner Solutions on AWS: AWS Marketplace Benefits. • Streamlined delivery of FPGA-accelerated solutions: Offer software as a managed Amazon Machine Image (AMI) and one or more Amazon FPGA Images (AFI), with secure 1-click purchasing. • Discover new customers: Allow customers to launch directly from AWS Marketplace, decreasing the length of sales cycles. Sellers can also offer free trials with no additional engineering effort. • Simplified billing & payments: Customers pay for AWS Marketplace software as part of the regular AWS billing cycle. AWS manages the complexity of AMI and AFI security, metering, billing, payment collection, and financial reporting. • Secure your FPGA-based products: FPGA custom logic is deployed to customers in a secure way, with no ability to view, copy, or edit the AFI logic. • Provide seamless product support: AWS Marketplace Product Support Connection makes it easy to support your customers on AWS Marketplace.
  • 31. F1 Glossary. FPGA: a Field Programmable Gate Array is a device that consists of very large numbers of configurable logic and memory elements interconnected by configurable routing resources. FPGAs differ from CPUs and GPUs by having no fixed instruction set, and in their ability to implement operations and processes that are pipelined and parallelized in an almost unlimited number of ways, using arbitrarily sized bit-widths. AFI (Amazon FPGA Image): a file containing the binary image for an FPGA bitstream. Loading an AFI onto an FPGA “programs” that device, within seconds, to perform one or more application-specific functions. HDL (Hardware Description Language): a low-level programming language designed for describing logic functions for the purpose of simulation and for conversion (via synthesis) to an FPGA or ASIC. Vivado and SDAccel: a set of design tools produced by Xilinx (provider of the F1 FPGA devices) for development of FPGA logic, pre-integrated and provided at no charge by AWS. Verilog: a commonly used HDL for FPGA design and simulation, supported by Vivado. VHDL: another commonly used HDL for FPGA design, also supported by Vivado.
  • 32. F1 Glossary (cont). OpenCL (Open Computing Language): a higher-level alternative to HDL programming, based on the C language and supported in the Xilinx SDAccel design tools. OpenCL can be used to target either FPGAs or GPUs. HDK (Hardware Development Kit): a set of tools, documentation, and associated FPGA libraries provided by AWS to assist FPGA developers with more rapid FPGA development, in particular to simplify the use of I/O from the FPGA to the host EC2 instance via PCIe, from FPGA to memory, and from FPGA to FPGA. AXI: an FPGA-internal bus format providing standardized interfaces for memory-mapped communications and for high-speed streaming data. AXI is used in the F1 HDK to define interfaces between AWS-provided interface logic and custom logic provided by FPGA developers. Developer AMI: a preconfigured AMI, available in the AWS Marketplace, that includes all necessary software and libraries for FPGA development, including the Vivado software and the HDK libraries enabling HDL design and simulation.
  • 33. F1 Glossary (cont). Synthesis: the process, using software tools provided with Vivado, of converting an HDL or OpenCL application into a lower-level format (sometimes referred to as a “netlist”) representing the individual logic elements of the application, for example AND, OR, XOR gates, adders and multipliers, shift registers, etc. This “netlist” must be further processed, using place-and-route software, to create a downloadable bitstream. Place-and-Route: the process, using software tools provided with Vivado, of mapping individual logic elements to precise locations in the target FPGA, and specifying their interconnections. Place-and-route is an iterative process that can require hours to complete for larger applications and larger FPGAs. Bitstream: a binary format representing the synthesized, placed, and routed FPGA application ready for downloading to an FPGA. Design Check Point (DCP): a binary file format containing the FPGA bitstream, ready for ingestion during the creation of an Amazon FPGA Image (AFI).
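As a rough illustration of what a "netlist" of gates means, the sketch below expresses addition using only AND, OR, and XOR operations on single bits, the kind of primitive elements the Synthesis entry lists. It is a hand-written ripple-carry adder, not the output of Vivado or any real synthesis tool, and the function names are invented for this example.

```python
from typing import Tuple


def full_adder(a: int, b: int, carry_in: int) -> Tuple[int, int]:
    """One-bit full adder built only from XOR, AND, and OR gates."""
    partial = a ^ b                          # XOR gate
    total = partial ^ carry_in               # XOR gate
    carry_out = (a & b) | (partial & carry_in)  # two AND gates and one OR gate
    return total, carry_out


def ripple_add(x: int, y: int, width: int) -> Tuple[int, int]:
    """Add two unsigned `width`-bit values by chaining full adders.

    Returns (sum truncated to `width` bits, final carry-out bit).
    """
    carry = 0
    result = 0
    for i in range(width):
        bit, carry = full_adder((x >> i) & 1, (y >> i) & 1, carry)
        result |= bit << i
    return result, carry


if __name__ == "__main__":
    print(ripple_add(5, 3, width=4))    # (8, 0)
    print(ripple_add(15, 1, width=4))   # (0, 1): overflow appears as carry-out
```

A synthesis tool would produce a far larger and more optimized network of such elements, which place-and-route then maps onto physical locations in the FPGA.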
  • 34. Additional Resources AWS F1 details: AWS Marketplace: AWS Educate: Edico Genome: NGCODEC: Maxeler: Ryft:
  • 35. Thank you!