AWS offers a broad assortment of EC2 instance types for different compute workloads. You can build a cloud architecture using instance types with different combinations of CPU, memory, storage, and networking capacity that correspond to your application needs. Instances can be sized to match your workloads to help you scale your cloud deployment over time. We reviewed all EC2 instance types in an earlier article. In this post, we will focus on accelerated computing instances to help you choose the right AWS resources for maximizing cloud performance at the lowest cost.
What applications are AWS accelerated computing instances well-suited for?
Accelerated computing instances are designed for applications that require high processing capability. They provide access to hardware-based compute accelerators, or co-processors, such as Graphics Processing Units (GPUs) and Field Programmable Gate Arrays (FPGAs). These accelerators perform certain functions, such as floating point number calculations, graphics processing, and data pattern matching, more efficiently than software running on CPUs. They also enable more parallel processing for higher throughput.
GPU-based instances provide access to NVIDIA GPUs with thousands of compute cores.
FPGA-based instances provide access to large FPGAs with millions of parallel system logic cells.
Instance performance
All accelerated computing instances, except Inf1, are EBS-optimized by default, enabling consistently high performance for EBS volumes by eliminating any conflict between Amazon EBS I/O and other network traffic.
What do you need to run GPU-based instances?
To access the GPU, these instances must have the appropriate NVIDIA driver installed on Linux. The main types of NVIDIA drivers that can be used with GPU instances are Tesla, GRID and Gaming drivers.
- Tesla drivers are primarily designed for compute workloads, and are supported by G4, G3, P3 and P2 instances
- GRID drivers are designed for professional visualization applications that render content, such as 3D models or high-resolution videos, and are supported by G4, G3, G2 and P3 instances
- Gaming drivers are supported only by G4 instances
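The driver support listed above can be captured as a small lookup table. The following is a minimal sketch: the mapping is transcribed from the list in this post, and the names `DRIVER_SUPPORT` and `drivers_for` are hypothetical helpers, not part of any AWS or NVIDIA tool.

```python
# Driver support by instance family, transcribed from the list above.
# DRIVER_SUPPORT and drivers_for are illustrative names, not an AWS API.
DRIVER_SUPPORT = {
    "Tesla": {"G4", "G3", "P3", "P2"},
    "GRID": {"G4", "G3", "G2", "P3"},
    "Gaming": {"G4"},
}


def drivers_for(family: str) -> list[str]:
    """Return the driver types listed for an instance family such as 'G4'."""
    key = family.strip().upper()
    return sorted(
        name for name, families in DRIVER_SUPPORT.items() if key in families
    )


if __name__ == "__main__":
    for fam in ("G4", "P3", "P2"):
        print(fam, drivers_for(fam))
```

A lookup like this can be handy in provisioning scripts that need to pick a driver package based on the instance family being launched.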
What can you do to optimize GPU settings?
The NVIDIA driver comes with an autoboost feature by default, which adjusts GPU clock speeds. To consistently reach maximum GPU performance, you can disable the autoboost feature and manually configure GPU clock speeds to their maximum frequency.
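As a rough illustration of the step described above, the sketch below builds the two `nvidia-smi` command lines typically used for it: one to turn off the autoboost default and one to set application clocks. It only constructs the strings and does not run them. The helper name `max_clock_commands` is hypothetical, and the clock values in the example are placeholders that must be checked against the output of `nvidia-smi -q -d SUPPORTED_CLOCKS` on your instance. Whether the autoboost flag applies at all depends on the GPU generation.

```python
import shlex


def max_clock_commands(memory_mhz: int, graphics_mhz: int) -> list[str]:
    """Build (but do not run) nvidia-smi commands that disable autoboost
    and pin application clocks to the given memory,graphics pair in MHz."""
    if memory_mhz <= 0 or graphics_mhz <= 0:
        raise ValueError("clock speeds must be positive MHz values")
    commands = [
        # Turn off the autoboost default for all GPUs on the instance.
        ["sudo", "nvidia-smi", "--auto-boost-default=0"],
        # Set application clocks as "<memory>,<graphics>".
        ["sudo", "nvidia-smi", "-ac", f"{memory_mhz},{graphics_mhz}"],
    ]
    return [shlex.join(cmd) for cmd in commands]


if __name__ == "__main__":
    # Placeholder clock values; verify the supported maximums on your GPU.
    for line in max_clock_commands(memory_mhz=2505, graphics_mhz=875):
        print(line)
```

Keeping the commands as data makes it easy to review them before handing them to a configuration management tool or a startup script.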
P3 and P2 Instances
P3 instances are the latest generation of general-purpose GPU instances. They accelerate machine learning and high performance computing applications with powerful GPUs.
Hardware Specifications
Use Cases
- Machine/Deep learning
- High performance computing
- Computational fluid dynamics
- Computational finance
- Seismic analysis
- Speech recognition
- Autonomous vehicles
- Molecular modeling
- Drug discovery
Featuring up to eight latest-generation NVIDIA V100 Tensor Core GPUs, P3 instances reduce machine learning training time from days to minutes for data scientists, researchers, and developers.
Inf1 Instances
Inf1 instances are designed for machine learning inference applications. They use AWS Inferentia chips, custom designed to enable low latency inference performance at any scale.
Hardware Specifications
Use Cases
- High performance computing
- Computational fluid dynamics
- Computational finance
- Seismic analysis
- Speech recognition
- Autonomous vehicles
- Drug discovery
- Recommendation engines
- Forecasting
- Image and video analysis
- Advanced text analytics
- Document analysis
- Voice, conversational agents, translation, transcription
- Fraud detection
G4 and G3 Instances
These are GPU-based instances designed for building and running graphics-intensive applications at a low cost. They provide access to NVIDIA Tesla GPUs, which accelerate scientific, engineering, and rendering applications through the CUDA or Open Computing Language (OpenCL) parallel computing frameworks, as well as graphics applications that use DirectX or OpenGL. G4 instances support NVIDIA GRID Virtual Workstation.
Hardware Specifications
Use Cases
- Machine learning inference for applications like adding metadata to an image, object detection, recommender systems, automated speech recognition, and language translation
- Graphics-intensive applications, such as remote graphics workstations, video transcoding
- 3D computer animation, modeling, simulation, and rendering, photo-realistic design, Autodesk Maya or 3D Studio Max
- Game streaming, 3-D application streaming in the cloud
What is the difference between G4 and G3 instances?
G4 instances offer up to a 1.8x increase in graphics performance and up to 2x the video transcoding capability of G3 instances. They also deliver better price/performance for inference and for small-scale and entry-level machine learning training workloads.
F1 Instances
F1 instances use FPGAs to deliver customizable hardware acceleration for computationally intensive algorithms, such as data-flow or highly parallel operations, that are not supported by general purpose CPUs. Developers can use the FPGA Developer AMI and AWS Hardware Developer Kit to accelerate hardware with F1 instances.
FPGA-based instances are only available on Linux, and do not support Microsoft Windows.
Hardware Specifications
Use Cases
- Genomics research
- Financial analytics
- Real-time video processing
- Big data search and analysis
- Large Throughput Image Processing
- Network and Security
Accelerated computing instances pack a lot of power to run your graphics-intensive applications on AWS. Parquantix monitors EC2 usage in real time to ensure maximum instance utilization.
Contact us for a 30-minute consultation to find out how you can optimize your applications with Parquantix.
Optimizing AWS Spend for EC2 Accelerated Computing Instances
Ongoing cost management is vital to optimizing your AWS architecture. As your computing needs increase, your usage costs can quickly add up. To control spend, you can use AWS pricing models that provide significant discounts for volume purchases of instances.
Reserved Instances
As your production environments steadily grow, you will benefit from considerable savings through Reserved Instances. If you reserve a set number of accelerated computing instances for 1 or 3 years, you will receive savings of up to 60% over on-demand rates. Managing Reserved Instances, though, is complicated and time-consuming. You need to dedicate time and effort to continuously monitor your workloads and apply reservations to actual usage. If you don’t keep on top of it, your savings can be quickly reversed.
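To see why unused reservations can reverse the savings, consider a simplified model in which a reservation is paid for every hour of the term whether or not the instance runs. Under that assumption, a reservation with discount d only pays off when the instance is in use for more than a fraction 1 - d of the term. The sketch below works through this arithmetic. The $3.00 hourly rate is a made-up figure for illustration, not an actual AWS price, and the function names are hypothetical.

```python
HOURS_PER_YEAR = 8760


def on_demand_cost(hourly_rate: float, utilization: float,
                   hours: int = HOURS_PER_YEAR) -> float:
    """On-demand cost: you pay only for the fraction of hours actually used."""
    return hourly_rate * utilization * hours


def reserved_cost(hourly_rate: float, discount: float,
                  hours: int = HOURS_PER_YEAR) -> float:
    """Reserved cost in this simplified model: the discounted rate is paid
    for every hour of the term, regardless of utilization."""
    return hourly_rate * (1 - discount) * hours


def break_even_utilization(discount: float) -> float:
    """Utilization above which the reservation is cheaper than on-demand."""
    return 1 - discount


if __name__ == "__main__":
    rate = 3.00      # hypothetical on-demand $/hour, not a real AWS price
    discount = 0.60  # the "up to 60%" figure from the text
    print(f"break-even utilization: {break_even_utilization(discount):.0%}")
    for utilization in (0.25, 0.50, 1.00):
        od = on_demand_cost(rate, utilization)
        ri = reserved_cost(rate, discount)
        print(f"utilization {utilization:.0%}: on-demand ${od:,.0f}, "
              f"reserved ${ri:,.0f}")
```

In this model, an instance that runs only a quarter of the time costs less on demand than under the reservation, which is why reservations need to be matched to actual usage.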
The cost optimization tool by Parquantix monitors usage hourly. It resizes and allocates reservations to match usage. It captures volume discounts and sells unused RIs in the AWS Marketplace to increase savings. The AI-driven algorithms adjust your RI mix and allocation based on changes in your workloads and business needs. The automated tool will relieve the complexity and time commitment so that you can focus on building and running your applications on AWS.
Savings Plans
Savings Plans offer significant discounts for a commitment to a specific dollar amount of instance usage over a 1- or 3-year period. They are available not just for EC2 instances, but also for other compute resources such as AWS Fargate and Lambda.
Reserved Instances consistently provide higher savings than Savings Plans, yet your application requirements will dictate which pricing model is better for your usage needs. For a detailed comparison of the two pricing plans, view our previous article.
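The dollar-based commitment can be illustrated with a simplified hourly model. It assumes a single flat discount rate: the commitment is paid every hour, it covers usage priced at the discounted rate, and any usage beyond what the commitment covers is billed at on-demand rates. The function name `hourly_bill` and the figures in the example are hypothetical, and real bills involve per-instance rates that this sketch ignores.

```python
def hourly_bill(commitment: float, on_demand_usage: float,
                discount: float) -> float:
    """Simplified hourly bill under a Savings Plan-style commitment.

    commitment:      committed spend per hour, at discounted rates
    on_demand_usage: that hour's usage, priced at on-demand rates
    discount:        flat fractional discount, 0 <= discount < 1
    """
    if not 0 <= discount < 1:
        raise ValueError("discount must be in [0, 1)")
    if commitment < 0 or on_demand_usage < 0:
        raise ValueError("amounts must be non-negative")
    # On-demand-equivalent usage that the commitment can absorb.
    coverable = commitment / (1 - discount)
    covered = min(on_demand_usage, coverable)
    # Usage beyond the commitment is billed at on-demand rates.
    overflow = on_demand_usage - covered
    # The commitment is owed even when usage falls short of it.
    return commitment + overflow


if __name__ == "__main__":
    # Hypothetical figures: $4/hour commitment with a 50% discount.
    print(hourly_bill(commitment=4.0, on_demand_usage=10.0, discount=0.5))
    print(hourly_bill(commitment=4.0, on_demand_usage=5.0, discount=0.5))
```

The second call shows the cost of over-committing: the full commitment is billed even though usage did not consume it.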
The table below illustrates the estimated savings on accelerated computing instances through these volume-based pricing plans. A p3.2xlarge running on Linux in US East (N. Virginia) can be purchased for substantial savings over on-demand rates.
Optimize Your AWS Spend with an Automated Tool
Building and running your cloud architecture with the appropriate mix of accelerated computing instance types requires thorough consideration of all pricing options. Parquantix will recommend a cloud cost management strategy based on your application and business needs. We will then execute the strategy to deliver the instance architecture that provides the highest value for your applications in the cloud.
Are you ready to start optimizing your applications with accelerated computing instances? Contact us to schedule a 30-minute consultation.