Welcome to episode 281 of The Cloud Pod, where the forecast is always cloudy! Justin and Ryan are your hosts as we search the clouds for all the latest news and info. This week we’re talking about ECS turning 10 (yes, we were there when it was announced, and yes, we’re old), some more drama from the CrowdStrike fiasco, lots of updates to GitHub, plus more. Join us!
Titles we almost went with this week:
- 🫙GitHub Universe full of ECS containers
- 🛰️GitHub Universe lives up to the Universal expectations
A big thanks to this week’s sponsor:
We’re sponsorless! Want to get your brand, company, or service in front of a very enthusiastic group of cloud news seekers? You’ve come to the right place! Send us an email or hit us up on our Slack channel for more info.
Follow Up
01:09 Dr. Matt Wood ended up at PwC as chief innovation officer
- YAWN
- What exactly does a chief innovation officer at PwC do? Is this like a semi-retirement?
General News
01:44 TSA silent on CrowdStrike’s claim Delta skipped required security update
- Delta isn’t backing down with CrowdStrike, and in a court filing said CrowdStrike should be on the hook for the entire $500M in losses, partly because CrowdStrike has admitted that it should have done more testing and staggered deployments to catch bugs.
- Delta further alleges that CrowdStrike postured as a certified best-in-class security provider who “never cuts corners,” while secretly designing its software to bypass Microsoft security certifications to make changes at the core of Delta’s computer systems without Delta’s knowledge.
- Delta says they would never have agreed to such a dangerous process if it had been disclosed.
- In its testimony to Congress, CrowdStrike said that they follow standard protocols, and that they are protecting against threats as they evolve.
- CrowdStrike is also accusing Delta of failing to follow laws, including best practices established by the TSA.
- According to CrowdStrike, most customers were up within a day of the issue – while Delta took 5 days.
- CrowdStrike alleges that the prolonged outage was caused by Delta’s negligence in following the TSA requirements designed to ensure that no major airline ever experiences prolonged system outages.
- CrowdStrike realized Delta failed to follow the requirements when its efforts to help remediate the issue revealed alleged technological shortcomings and failures to follow security best practices, including outdated IT systems, issues in Delta’s AD environment and thousands of compromised passwords.
- Delta threatened to sue Microsoft as well as CrowdStrike, but has only named CrowdStrike to date in the lawsuits.
3:48 📢 Ryan – “It’s a tool that needs to evolve very quickly to emerging threats. And while the change that was pushed through shouldn’t have gone through that particular workflow, and that’s a mistake, I do think that that should exist as part of it. Yes, could they have done better with documentation and all that? Of course.”
04:51 Google is a Leader in Gartner Magic Quadrant for Strategic Cloud Platform Services
- It’s Magic Quadrant time! But let’s be real – when ISN’T it MQ time.
- The Magic Quadrant is out for Cloud Platforms… and AWS is still top dog.
- BUT Microsoft and Google have moved further to the right than AWS, which is the axis for completeness of vision.
- Oracle also made the leaders quadrant.
- AWS
- Strengths: Operational excellence, Solutions support, Robust developer experience
- Cautions: Complex and inconsistent service interfaces, Limited traction for proprietary AI models, Fewer sovereign cloud options
- Google
- Strengths: AI-infused IT modernization, Environmental sustainability, Digital sovereignty
- Cautions: Incomplete understanding of traditional enterprise needs, Uneven resilience, Distributed cloud inconsistencies
- Azure
- Strengths: Cross-Microsoft capabilities, Industry clouds, Strategic partnership with OpenAI
- Cautions: Ongoing security challenges, Capacity shortages, Inconsistent service and support
07:04 📢 Justin – “…it’s still a shared security model. You still have requirements you have to meet. So you’re not off the hook completely by checking assured workloads for sure.”
08:12 4.2 Tbps of bad packets and a whole lot more: Cloudflare’s Q3 DDoS report
- Cloudflare gives us the 19th edition of the Cloudflare DDoS threat report.
- The number of DDoS attacks spiked in the third quarter of 2024.
- Cloudflare mitigated nearly 6 million DDoS attacks, representing a 49% increase QoQ and a 55% increase YoY.
- Out of those 6 million, Cloudflare’s autonomous DDoS defense systems detected and mitigated over 200 hyper-volumetric DDoS attacks exceeding rates of 3 terabits per second (Tbps) and 2 billion packets per second (Bpps).
- The largest attack peaked at 4.2 Tbps and lasted a minute.
- The Banking and Financial services industry is subjected to the most DDoS attacks.
- China was the country most targeted, and Indonesia was the largest source of attacks.
09:27 📢 Justin – “DDoS is not an IF thing. It’s a WHEN problem for every company.”
AI is Going Great – Or How ML Makes All Its Money
10:12 GitHub Copilot moves beyond OpenAI models to support Claude 3.5, Gemini
- In a sign of continuing ruptures between OpenAI and Microsoft (in Justin’s opinion), Copilot will switch from being exclusively OpenAI GPT models to a multi-model approach over the coming weeks.
- First, Anthropic’s Claude 3.5 Sonnet will roll out to Copilot Chat’s web and VS Code interfaces, with Google Gemini 1.5 Pro coming shortly after.
- In addition, Copilot will support OpenAI’s o1-preview and o1-mini, which are intended to be stronger at advanced reasoning than GPT-4, which Copilot has used until now.
- The new approach makes sense for users, as certain models are better at certain languages or types of tasks.
- “There is no one model to rule every scenario,” wrote GitHub CEO Thomas Dohmke. “It is clear the next phase of AI code generation will not only be defined by multi-model functionality, but by multi-model choice.”
11:11 📢 Ryan – “it’s very interesting that GitHub is doing that with Microsoft’s heavily involvement in OpenAI. But I also wonder if this is one of those things where the subsidiary is given a little bit more leniency, especially since it’s not really divorcing OpenAI or ChatGPT in general.”
AWS
12:32 EC2 Image Builder now supports building and testing macOS images
- macOS is now supported in EC2 Image Builder.
- This will allow you to create and manage machine images for your macOS workloads, in addition to the existing support for Windows and Linux.
13:54 Celebrating 10 Years of Amazon ECS: Powering a Decade of Containerized Innovation
- ECS is now 10 years old!! We still remember it being announced at re:Invent in 2014… and we’ve been fans ever since.
- It’s had a fun evolution:
- 2014 EC2 Container Service Launch
- 2015 ECS Autoscaling
- 2016 ALB for ECS
- 2017 AWS Fargate
- 2018 AWS Auto Scaling
- 2019 Graviton2 support
- 2020 Bottlerocket
- 2021 ECS Exec
- 2022 ECS Service Connect
- 2023 GuardDuty ECS runtime support
- 2024 EBS support
16:29 📢 Justin – “Despite Kubernetes dominating the market, you know, ECS has continued to get a lot of innovation. I imagine it runs a lot of services under the hood at AWS for their use cases and how they run your services that you consume…Happy birthday, ECS. Stop getting older because I can’t be aging this fast.”
17:54 AWS announces EFA update for scalability with AI/ML applications
- AWS announces the launch of a new interface type that decouples the EFA and the ENA.
- EFA provides the high-bandwidth, low-latency networking crucial for scaling AI/ML workloads.
- The new interface (EFA-only) allows you to create a standalone EFA device on secondary interfaces.
- This allows you to scale your compute clusters to run AI/ML applications without straining private IPv4 space or encountering IP routing challenges with Linux.
GCP
19:35 AI Hypercomputer software updates: Faster training and inference, a new resource hub, and more
- Google is announcing major updates to the AI Hypercomputer software layer for training and inference performance and improved resiliency at scale, as well as a centralized hub for AI Hypercomputer resources.
- Centralized AI Hypercomputer Resources on GitHub:
- Launch of the AI Hypercomputer GitHub organization, a central repository for developers to access reference implementations like MaxText and MaxDiffusion, orchestration tools like xpk (Accelerated Processing Kit), and performance recipes for GPUs on Google Cloud.
- Facilitates easier discovery and contribution to AI Hypercomputer’s open-source projects.
- MaxText Now Supports A3 Mega VMs:
- MaxText, an open-source, high-performance implementation for large language models (LLMs), is now optimized for A3 Mega VMs powered by NVIDIA H100 Tensor Core GPUs.
- A3 Mega VMs offer a 2x improvement in GPU-to-GPU network bandwidth over A3 VMs.
- Collaboration with NVIDIA to optimize JAX and XLA for overlapping communication and computation on GPUs.
- Introduction of FP8 mixed-precision training using Accurate Quantized Training (AQT), delivering up to 55% improvement in effective model FLOPS utilization compared to bf16 precision.
- Reference Implementations and Kernels for Mixture of Experts (MoE):
- Expansion of MaxText to include both “capped” and “no-cap” MoE implementations, providing flexibility between predictable performance and dynamic resource allocation.
- Open-sourcing of Pallas kernels optimized for block-sparse matrix multiplication on Cloud TPUs, compatible with PyTorch and JAX, enhancing MoE model training performance.
- Monitoring Large-Scale Training:
- Introduction of a reference monitoring recipe to create a Cloud Monitoring dashboard in Google Cloud projects.
- Enables tracking of metrics like CPU utilization and identification of outliers, simplifying MLOps for large-scale training jobs.
- SparseCore on Cloud TPU v5p Now Generally Available:
- SparseCore, a hardware accelerator for embeddings on Cloud TPU v5p, is now generally available.
- Each TPU v5p chip includes four SparseCores, delivering up to 2.5x performance improvement for models like DLRM-V2 compared to previous generations.
- Enhances performance for recommender systems and models relying on embeddings.
- Improved LLM Inference Performance:
- Introduction of KV cache quantization and ragged attention kernels in JetStream, an open-source, optimized engine for LLM inference.
- These enhancements improve inference performance by up to 2x on Cloud TPU v5e.
21:02 📢 Ryan – “it really does show how much the AI branding is taking over everything. Because a lot of these things were the same things we were talking about for machine learning.”
21:44 BigQuery’s AI-assisted data preparation is now in preview
- Now in preview, BigQuery data preparation provides a number of capabilities:
- AI-powered suggestions: BigQuery data preparation uses Gemini in BigQuery to analyze your data and schema and provide intelligent suggestions for cleaning, transforming, and enriching the data. This significantly reduces the time and effort required for manual data preparation tasks.
- Data cleansing and standardization: Easily identify and rectify inconsistencies, missing values, and formatting errors in your data.
- Visual data pipelines: The intuitive, low-code visual interface helps both technical and non-technical users easily design complex data pipelines, and leverage BigQuery’s rich and extensible SQL capabilities.
- Data pipeline orchestration: Automate the execution and monitoring of your data pipelines. The SQL generated by BigQuery data preparation can become part of a Dataform data engineering pipeline that you can deploy and orchestrate with CI/CD, for a shared development experience.
22:12 📢 Justin – “What could go wrong with low code complex data pipeline?”
23:21 Google Cloud Apigee named a Leader in the 2024 Gartner® Magic Quadrant™ for API Management
- It’s amazing how many companies are in this quadrant but don’t feel like real API gateways.
24:29 📢 Justin – “Amazon web services though, being a very, very good at ability to execute, but not a completeness of vision. they’re in the challenger quadrant, speaks volumes about how little innovation API gateway has gotten.”
Azure
25:42 What Microsoft’s financial disclosures reveal about Azure’s market position
- Microsoft will change the way it reports some Azure metrics to the stock market in its upcoming earnings call (which we’ll cover next week).
- Microsoft said the change will align Azure with consumption revenue and, by inference, align more closely with how AWS reports its metrics.
- The accounting change removed slower-growth revenue streams and raised the growth rates for Azure.
- It also increased the AI contribution within Azure.
- Removed services:
- EMS (Enterprise Mobility and Security) and Power BI
27:17 Azure at GitHub Universe: New tools to help simplify AI app development
- GitHub Copilot for Azure is now in preview, integrating Azure with the tools you already use in your IDE.
- You can now use @azure, giving you personalized guidance to learn about services and tools without leaving your code.
- This can accelerate and streamline development by provisioning and deploying resources through Azure Developer CLI templates.
- AI App Templates further accelerate your development by helping you get started faster and simplifying evaluation and the path to production.
- You can use an AI App Template directly in your preferred IDE, such as GitHub Codespaces, VS Code, and Visual Studio.
- You can even get recommendations for specific templates right from GitHub Copilot for Azure based on your AI use case or scenario.
- GitHub Models is now in preview to give you access to Azure AI’s leading model garden.
- Keeping Java apps up to date can be time consuming. To help, they are giving you the GitHub Copilot upgrade assistant for Java, which uses AI to simplify the process and lets you upgrade your Java apps with minimal manual effort.
- Scale AI applications with Azure AI evaluation and online A/B experimentation using CI/CD workflows.
28:37📢 Ryan – “I like all of these, but I really don’t like that they’re keeping the Java apps up to date. Like, they’re just furthering the life of that terrible, terrible language. And one of the things is that they abstract all these simple things away, but it’s like, that’s why I hate it. It shouldn’t exist. It’s terrible. And newer languages have moved on.”
29:21 New from Universe 2024: Get the latest previews and releases
- AI-Native = GitHub Copilot Workspace + Code Review + Copilot Autofix, allowing you to rapidly refine, validate, and land Copilot-generated code suggestions from Copilot code review, Copilot Autofix, and third-party Copilot extensions.
- GitHub Spark is a new way to start ideas. It’s powered by natural language, and it sets the stage for GitHub’s vision to help 1 billion people become developers.
- With live history, previews, and the ability to edit code directly, GitHub Spark allows you to create microapps that take that crazy small, fun idea and bring it to life.
- To raise the quality of Copilot-powered experiences, they have added new features such as multi-model choice, improved code completion, implicit agent selection in GitHub Copilot Chat, better support for C++ and .NET, and expanded availability in Xcode and Windows Terminal.
- You can now edit multiple lines and files with Copilot in VS Code, applying edits directly as you iterate on your codebase with natural language.
- GitHub Copilot code reviews provide Copilot-powered feedback on your code as soon as you create a pull request.
- This means no more waiting for hours to start the feedback loop. Configure rules for your team and keep quality high with the help of your trusted AI pair programmer. Now supporting C#, Java, JavaScript, Python, TypeScript, Ruby, Go, and Markdown.
- GitHub Copilot extensions allow you or your organization to integrate proprietary tools directly into your IDE via the GitHub Marketplace.
- Some that we saw in the marketplace were Docker for GitHub Copilot, Teams Toolkit for GitHub Copilot, Atlassian, New Relic, etc.
- For the EU, you now get data residency for GitHub Enterprise Cloud.
- GitHub Issues got further improvements with sub-issues, issue types, advanced search, and increased project item limits.
28:37📢 Ryan – “I do like adding the code reviews and feedback ability to GitHub. I think that’s a fantastic thing just to have built in. I hope that that allows some of the finding nine different people to validate my PRs to make sure I can go to production, go away, but we’ll see, doubt it.”
34:06 Accelerate scale with Azure OpenAI Service Provisioned offering
- Azure OpenAI Service Data Zones allows enterprises to scale AI workloads while maintaining compliance with regional data residency requirements.
- It offers flexible, multi-regional data processing within selected data boundaries, eliminating the need to manage multiple resources across regions.
- 99% Latency SLA for Token Generation: Ensures faster and more consistent token generation speeds, especially at high volumes, providing predictable performance for mission-critical applications.
- Reduced Pricing and Lower Deployment Minimums:
- Hourly pricing for Provisioned Global deployments reduced from $2.00 to $1.00 per hour.
- Deployment minimums for Provisioned Global reduced by 70%, and scaling increments reduced by up to 90%, lowering the barrier for businesses to start using the Provisioned offering.
- Prompt Caching: Offers a significant cost and performance advantage by caching repetitive API requests. Cached tokens are discounted by 50% for the Standard offering.
- Simplified Token Throughput Information: Provides a clear view of input and output tokens per minute for each Provisioned deployment, eliminating the need for detailed conversion tables or calculators.
35:36📢 Justin – “I implemented Claude and my VS code, and when I ask it questions now it tells me how many tokens I used, which has been really helpful to like learn how many tokens and how much that does cost me. You know, especially when you’re paying by the drip now, like I have Claude subscription as well. And that one, just paid 20 bucks a month and I see the value of just paying 20 bucks a month if you’re doing a lot of heavy duty stuff, but if you need to integrate an app, you have to use API’s and that’s where the tokens really kill you.”
36:04 Announcing AzAPI 2.0
- The AzAPI provider, designed to expedite the integration of new Azure services with HashiCorp Terraform, has now released version 2.0. This updated version marks a significant step toward their goal of providing launch-day support for Azure services using Terraform.
- Key features of AzAPI include:
- Resource-specific versioning, allowing users to switch to a new API version without altering provider versions
- Special functions like azapi_update_resource and azapi_resource_action
- Immediate day 0 support for new services.
- Also, all resource properties, outputs, and state representation are now handled in HashiCorp Configuration Language (HCL) instead of JSON.
37:15📢 Justin – “I kind of like the idea of it though, because, you know, if you, if you change the API for the service and now you have to roll a whole brand new provider, you have to maintain a lot of branches of providers. Cause if you push, you know, to a new provider that has different syntax, like that could be a breaking change. So this allows you to take advantage of a newer API without the breaking change potentially.”
38:31 Announcing Azure OpenAI Global Batch General availability: At scale processing with 50% less cost!
- GA of Azure OpenAI global batch offering, designed to handle large-scale and high-volume processing tasks efficiently. Process asynchronous groups of requests with separate quota, a 24 hour turnaround and 50% less cost than global standard.
- Why Azure OpenAI Global Batch?
- Benefit from 50% lower costs, enabling you to either introduce new workloads or run existing workloads more frequently, thereby increasing overall business value.
- Efficiently handle large-scale workloads that would be impractical to process in real-time, significantly reducing processing times.
- Minimize engineering overhead for job management with a high resource quota, allowing you to queue and process gigabytes of data with ease. Substantially high quotas for batch.
Oracle
40:09 Create a multi cloud data platform with a converged database
- Oracle Autonomous Database will be available across all major cloud service providers (hyperscalers) by 2025, including Oracle Cloud Infrastructure (OCI), Amazon Web Services (AWS), Google Cloud Platform (GCP), and Microsoft Azure.
- Introduction of Oracle’s Converged Database Solution: A single database that manages all data types (structured, unstructured, graph, geospatial, vectors) and can be deployed across private data centers and all major cloud platforms.
- New Features:
- Deployment Across Multiple Clouds:
- Oracle Autonomous Database on OCI: Offers features like automated security measures, continuous monitoring, and scalability without rearchitecting applications.
- Integration with AWS: Strategic partnership enabling deeper analytical insights by combining Oracle Database services with AWS Analytics for near-real-time analytics and machine learning without complex data pipelines.
- Oracle Database@Azure: Availability of Oracle Database services within Azure data centers, allowing seamless integration with native Microsoft Azure services for high performance and low latency.
- Oracle Database@Google Cloud: Integration of Oracle technologies into Google Cloud, providing services like Oracle Exadata Database Service and Oracle Autonomous Database, fully integrated into Google Cloud networking.
- Converged Database Capabilities:
- Unified Data Management: Handles multiple data types within a single database system, reducing the need for multiple specialized databases.
- Compliance with Data Residency Regulations: Ensures minimal data replication and consistent data management across geographies to meet stringent regulatory requirements.
41:58📢 Justin – “And it’s kind of interesting, but I can think of really interesting data warehouse use cases. could see some interesting, you know, different global replication needs that you might have that this could be really handy. And so if you’re already sending all the money to Oracle, why not take advantage of something like this? If it makes sense for your solution.”
42:33 Oracle Cloud Migrations can now migrate AWS EC2 VM instances to OCI
- Oracle can now natively migrate your EC2 VMs to OCI.
- This fully managed toolset provides you with complete control over the migration workflow while simplifying and automating the process, including:
- Automatically discovering VMs in your source environment
- Creating and managing an inventory within OCI of the resources identified in the source environment
- Providing compatibility assessments, metrics, recommendations, and cost comparisons
- Creating plans and simplifying the deployment of migration targets in OCI
Closing
And that is the week in the cloud! Visit our website, the home of The Cloud Pod, where you can join our newsletter or Slack team, send feedback, or ask questions at theCloudPod.net, or tweet at us with the hashtag #theCloudPod.