273: Phi-fi-fo-fum, I Smell the Bones of The Cloud Pod Hosts


Welcome to episode 273 of The Cloud Pod, where the forecast is always cloudy! Hold onto your butts – this week your hosts Justin, Ryan, Matthew and (eventually) Jonathan are bringing you two weeks’ worth of cloud and AI news. We’ve got Karpenter, Kubernetes, and Secrets, plus news from OpenAI, MFA changes that are going to be super fun for Matthew, and Azure Phi. Get comfy – it’s going to be a doozy!

Titles we almost went with this week:

  • 🐪The Cloud Pod Teaches Azure-normalized Camel Casing
  • 🧳The Cloud Pod Travels to Malaysia
  • ⚖️Azure Detaches Itself From its Own Scale Sets
  • ✍️The Cloud Pod Conditionally Writes Show Notes 
  • 🏫You got MFA!
  • ⛔The Cloud Pod Delays Deleting Itself
  • 🎤The Cloud Pod is Now the Cloud Pod Podcast!

A big thanks to this week’s sponsor:

We’re sponsorless! Want to get your brand, company, or service in front of a very enthusiastic group of cloud news seekers? You’ve come to the right place! Send us an email or hit us up on our slack channel for more info. 

General News

01:37 Terraform AzureRM provider 4.0 adds provider-defined functions 

  • HashiCorp has announced the GA of the Terraform AzureRM provider 4.0, which improves the provider’s extensibility and flexibility. 
  • Since the provider’s last major release in March 2022, HashiCorp has added support for some 340 resources and 120 data sources, bringing the totals to 1,101 Azure resources and almost 360 data sources. 
  • The provider has topped 660M downloads, and Microsoft and HashiCorp continue to develop new, innovative integrations that further ease the cloud adoption journey for enterprise organizations. 
  • With Terraform 1.8, providers can implement custom functions that you can call from your Terraform configuration. The new provider adds two Azure-specific provider functions that let users correct the casing of their resource IDs or access their individual components. 
  • Previously, the AzureRM provider took an all-or-nothing approach to Azure resource provider registration: the Terraform provider would either attempt to register a fixed set of 68 resource providers at initialization, or skip registration entirely. 
  • This didn’t match Microsoft’s recommendation to register resource providers only as needed, enabling just the services you’re actively using. 
  • With the addition of two new feature flags, resource_provider_registrations and resource_providers_to_register, users now have more control over which resource providers are registered automatically, or whether to continue managing a subscription’s resource providers themselves. 
  • AzureRM 4.0 also removes a number of deprecated items, so it’s recommended that you review the removed resources/data sources and the 4.0 upgrade guide.
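
A minimal sketch of what the 4.0 registration flags and one of the new provider functions might look like (the function name `normalise_resource_id`, the provider list, and the example ID are assumptions to verify against the provider docs):

```hcl
terraform {
  required_version = ">= 1.8" # provider-defined functions need Terraform 1.8+
  required_providers {
    azurerm = {
      source  = "hashicorp/azurerm"
      version = "~> 4.0"
    }
  }
}

provider "azurerm" {
  features {}
  # Register only the resource providers you actually use, per Microsoft's
  # recommendation, instead of the old fixed set of 68.
  resource_provider_registrations = "none"
  resource_providers_to_register  = ["Microsoft.Storage", "Microsoft.Compute"]
}

locals {
  # Provider-defined function: normalize the casing of a resource ID.
  fixed_id = provider::azurerm::normalise_resource_id(
    "/Subscriptions/00000000-0000-0000-0000-000000000000/resourcegroups/example"
  )
}
```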

03:50 📢 Justin – “Okay, so it doesn’t have anything really to do with Terraform. It has to do with Azure and enabling and disabling resource types that they can monkey with, basically, with configuration code.”

06:12 Rackspace Goes All In – Again – On OpenStack 

  • Rackspace hasn’t been very vocal about OpenStack, which it launched in 2010 out of a collaboration between NASA and Rackspace. 
  • Rackspace didn’t turn its back on the project per se: it has contributed over 5.6M lines of code, and it remains one of the largest OpenStack cloud providers. 
  • In recent years, however, the company has withdrawn to some extent from its commitments to OpenStack.
  • Recently it reaffirmed that commitment with the launch of OpenStack Enterprise, a fully managed cloud offering aimed at critical workloads that run at scale, bringing enhanced security and efficiency. 
  • The only thing we can think is… you wanted to make an alternative to VMWare. Got it. Good luck. 

07:35 📢 Ryan – “I think there should be something like OpenStack for, you know, being able to run your own hardware and, know, still get a lot of the benefits of compute in a cloud ecosystem, hardware that you control and ecosystems that maybe you don’t want being managed by a third party vendor. So happy to see OpenStack continue to gain support even though I haven’t touched it in years.”

AWS

08:39 Announcing Karpenter 1.0

  • Karpenter is an open-source Kubernetes cluster autoscaling project created by AWS. 
  • The project has been adopted for mission-critical use cases by industry leaders.
  • It’s been adding key features over the years, like workload consolidation, disruption controls and more. 
  • Now it has reached 1.0, and is no longer considered beta by AWS.
  • This new release includes the stable Karpenter APIs NodePool and EC2NodeClass.
  • As part of this release, the custom resource definition (CRD) API groups and kind name remain unchanged.  
  • AWS has also created conversion webhooks to make migrating from beta to stable more seamless.
  • Karpenter V1 adds support for disruption budgets by reason. 
  • The supported reasons are Underutilized, Empty and Drifted. 
    • This will enable the user to have finer-grained control of the disruption budgets that apply to specific disruption reasons. 
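
A sketch of what a v1 NodePool with per-reason budgets can look like (percentages, schedule, and the consolidation policy are illustrative; check the Karpenter docs for the exact schema):

```yaml
apiVersion: karpenter.sh/v1
kind: NodePool
metadata:
  name: default
spec:
  disruption:
    consolidationPolicy: WhenEmptyOrUnderutilized
    budgets:
      # Allow consolidating up to 10% of nodes at a time when underutilized.
      - reasons: ["Underutilized"]
        nodes: "10%"
      # Block disruption of drifted nodes during business hours.
      - reasons: ["Drifted"]
        nodes: "0"
        schedule: "0 9 * * mon-fri"
        duration: 8h
```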

09:28 📢 Ryan – “See, this is how I know Kubernetes is too complex. I feel like every other week there’s some sort of announcement of some other project that controls like the allocation of resources or the scaling of resources or the something something of pods. And I’m just like, okay, cool.”

11:26  Add macOS to your continuous integration pipelines with AWS CodeBuild

  • What took you so long? 
  • Now you can build applications on macOS with AWS CodeBuild.  
  • You can build artifacts on managed Apple M2 machines running macOS 14 Sonoma.
  • AWS CodeBuild is a fully managed continuous integration service that compiles source code, runs tests, and produces ready-to-deploy software packages. 
  • CodeBuild for macOS is based on the recently introduced reserved capacity fleets, which contain instances powered by EC2 but maintained by CodeBuild.  
    • With reserved capacity fleets, you configure a set of dedicated instances for your build environment. 
    • These machines remain idle, ready to process builds or tests immediately, which reduces build durations. 
  • CodeBuild provides a standard disk image for your build, with pre-installed versions of Xcode, Fastlane, Ruby, Python, and Node.js, and CodeBuild manages autoscaling of the fleet. 
  • CodeBuild for macOS works with reserved fleets. 
  • Unlike on-demand fleets, where you pay per minute of build, reserved fleets are charged for the time the build machines are reserved for your exclusive use, even when no builds are running. 
  • The capacity reservation follows the Amazon EC2 Mac 24-hour minimum allocation period, as required by the Software License Agreement for macOS (article 3.A.ii).
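
For context, a macOS build on CodeBuild still uses an ordinary buildspec; a minimal sketch (the scheme name and artifact paths are hypothetical):

```yaml
version: 0.2
phases:
  build:
    commands:
      - xcodebuild -version   # Xcode is pre-installed on the standard image
      - xcodebuild -scheme "MyApp" -destination "generic/platform=iOS" build
artifacts:
  files:
    - build/**/*
```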

09:28 📢 Justin- “You’re not spin up, so the key thing is that you don’t wanna spin up additional Mac OS’s every time you wanna do this because then you’re paying for every one of those for 24 hours. So because you have a reserved fleet, you’re using the same Mac OS that’s in the fleet and you don’t have to worry about auto scaling it up and down.”

15:00 Announcing general availability of Amazon EC2 G6e instances

  • AWS announced the general availability of EC2 G6e instances powered by NVIDIA L40S Tensor Core GPUs. 
  • G6e instances can be used for a wide range of ML and Spatial computing use cases. 
  • G6e instances deliver up to 2.5x better performance compared to G5 instances and up to 20% lower inference costs than p4d instances.
  • Customers can use G6e instances to deploy LLMs with up to 13B parameters and diffusion models for generating images, video and audio. 
  • G6e instances feature up to 8 NVIDIA L40S Tensor Core GPUs with 384 GB of total GPU memory (48 GB per GPU) and 3rd-generation AMD EPYC processors, with up to 192 vCPUs, 400 Gbps of network bandwidth, up to 1.536 TB of system memory, and up to 7.6 TB of NVMe SSD storage. 

15:56 📢 Ryan – “My initial reaction was like, got to figure out like a modern workload where I care about these types of specs on these specific servers. And then I remember I provide cloud platforms to the rest of the business and I go, no, this is going to be expensive. How am I going to justify all this… pass.”

16:56 Now open — AWS Asia Pacific (Malaysia) Region 

  • The AWS Malaysia Region, with three Availability Zones, is now open, with the API name ap-southeast-5.
  • This is the first infrastructure region in Malaysia and the 13th in Asia Pacific, joining Hong Kong, Hyderabad, Jakarta, Melbourne, Mumbai, Osaka, Seoul, Singapore, Sydney, Tokyo, and China. 
  • The new AWS Region will support the Malaysian government’s strategic Madani economic framework.
    • The initiative aims to improve living standards for all Malaysians by 2030 while supporting innovation in Malaysia and across ASEAN. 
    • The new region is expected to add about $12.1 billion (USD) to Malaysia’s GDP and support more than 3,500 full-time jobs at external businesses annually through 2038. 

15:56 📢 Justin – “The forecast models all die at 2038. We didn’t really understand why. We just assumed that’s when the jobs run out. No, no, that’s a different problem.”

19:52 CloudFormation simplifies resource discovery and template review in the IaC Generator

  • AWS CloudFormation now includes enhancements to the IaC generator, which customers use to create IaC from existing resources. 
  • Now, after the IaC generator finishes scanning the resources in an account, it presents a graphical summary of the different resource types to help customers find the resources they want to include in their template more quickly. 
  • After selecting resources, customers can preview their template in AWS Application Composer, visualizing the entire application architecture with the resources and their relationships. 

20:20 📢 Ryan- “This is how I do all of my deployment architectures. Now I just deploy everything and then I generate the picture, screenshot that and then document. Ta -da!”

21:19 Amazon DocumentDB (with MongoDB Compatibility) Global Clusters introduces Failover

  • DocumentDB now supports global cluster failover, a fully managed experience for performing a cross-region failover to respond to unplanned events such as regional outages.  
  • With Global Cluster Failover, you can convert a secondary region into the new primary region typically within a minute, while maintaining the multi-region global cluster configuration. 
  • An Amazon DocumentDB Global Cluster is a single cluster that can span up to 6 AWS regions, enabling DR from region-wide outages and low-latency global reads.
  • Combined with Global Cluster Switchover, you can easily promote a secondary region to primary for both planned and unplanned events.  
    • Switchover is a managed failover experience meant for planned events such as regional rotations. 

22:25 📢 Ryan – “I mean, anytime you can do this type of like a DR and failover at the data layer, I’m, I’m in love with, because it’s so difficult to orchestrate on your own. And so that’s a huge value from using a cloud provider. Like I would like to just click some boxes and make, and it will just work. Awesome.“

22:46 Amazon S3 now supports conditional writes

  • S3 adds support for conditional writes that check for the existence of an object before creating it. 
  • This allows you to prevent applications from overwriting existing objects when uploading data. 
    • You can perform conditional writes using PutObject or CompleteMultipartUpload API requests, in both general-purpose and directory buckets. 
  • This makes it easier to build distributed applications in which multiple clients concurrently update data across shared datasets.  
  • You no longer need to build client-side consensus mechanisms to coordinate updates, or make extra API requests to check whether an object exists before uploading data. 
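
In code, the conditional create boils down to one extra parameter on the PutObject call. A minimal sketch, assuming boto3’s `put_object` accepts the `IfNoneMatch` parameter added with this feature (the helper name here is ours):

```python
# Build the parameters for an S3 conditional write. "IfNoneMatch": "*"
# tells S3 to create the object only if the key does not already exist;
# a concurrent writer that loses the race gets an HTTP 412
# (PreconditionFailed) instead of silently overwriting.

def conditional_put_params(bucket: str, key: str, body: bytes) -> dict:
    """Build kwargs for s3_client.put_object() as a conditional create."""
    return {
        "Bucket": bucket,
        "Key": key,
        "Body": body,
        "IfNoneMatch": "*",  # fail with 412 if the object already exists
    }

# With a real client (illustrative, not run here):
#   import boto3, botocore.exceptions
#   s3 = boto3.client("s3")
#   try:
#       s3.put_object(**conditional_put_params("my-bucket", "data.json", b"{}"))
#   except botocore.exceptions.ClientError as e:
#       if e.response["Error"]["Code"] == "PreconditionFailed":
#           pass  # someone else created the object first
```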

23:28 📢 Justin – “…either you would have to do an API call to verify if the file was there before, which you’re not paying for, and then you can do your write, or you get to do this. And if you have all your apps trying to do this all at the same time, the milliseconds of latency can kill you on this type of thing. So having the ability is very nice.”

25:10 AWS Lambda now supports function-level configuration for recursive loop detection

  • AWS Lambda now supports function-level configuration which allows you to disable or enable recursive loop detection. 
  • Lambda recursive loop detection, enabled by default, is a preventative guard rail that automatically detects and stops recursive invocations between Lambda and other supported services, preventing runaway workloads. 
  • Customers running intentionally recursive patterns could turn off recursive loop detection on a per account basis through support. Now customers can disable or enable recursive loop detection on a per function basis, allowing them to run their intentionally recursive workflows while protecting the remaining functions in their account from runaway workloads caused by unintended loops. 
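
The per-function toggle is exposed through the function recursion config API; a CLI sketch, assuming the `put-function-recursion-config` command and a hypothetical function name:

```shell
# Allow intentional recursion for this one function (the default is to
# terminate detected recursive loops).
aws lambda put-function-recursion-config \
  --function-name my-intentionally-recursive-fn \
  --recursive-loop Allow

# Check the current setting for the function.
aws lambda get-function-recursion-config \
  --function-name my-intentionally-recursive-fn
```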

25:44 📢 Justin – “I remember when they first added this several years ago, we were like, this is amazing. Thank God they finally did this. But then I forgot about the support part that you had to reach out to support if you didn’t want your intentional recursive pattern stopped. And I, if I was going to go down that path, I’d just say, don’t – I’ve done something wrong. But, apparently if I think I’m actually right – which is a problem, I think I’m right all the time – it can now cost myself some money. So do be careful with this feature. It’s a gun that can shoot you in the foot very quickly.”

GCP

27:58 Looker opens semantic layer via new SQL Interface and connectors for Tableau & others

  • Google says that data is the driving force of innovation in business, especially in a world of accelerating AI adoption. But data-driven organizations struggle with inconsistent or unreliable metrics: without a single source of truth for data definitions, metrics can follow different logic depending on which tool or team they come from. 
  • Teams that can’t trust data go back to their gut, a risky strategy. 
  • Google designed Looker with a semantic model to let you define metrics once and use them everywhere, for better governance, security and overall trust in your data. 
  • To live up to that vision, they are releasing BI connectors, including the GA of their custom-built connector for Tableau, which makes it easier to use Looker’s metrics layer within the broader ecosystem of SQL-based tools, with an integration layer for LookML models based on BigQuery, plus connectors for popular products. 
  • This integration layer is the Open SQL Interface, and it gives Looker customers more options for how they deploy governed analytics.  
  • They are also releasing a general-purpose JDBC driver for connecting to the interface, and partners including ThoughtSpot, Mode, and APOS Systems have already integrated their products with Looker’s semantic layer. 
  • The connectors for Looker now include:
    • Google Sheets
    • Looker Studio
    • Power BI
    • Tableau
    • ThoughtSpot
    • Mode
    • APOS Systems
    • Custom JDBC

29:48 📢 Ryan- “…these types of connectors and stuff offer great amount of flexibility because these BI tools are so complex that people sort of develop their favorite and don’t want to use another one.”

31:10 C4 VMs now GA: Unmatched performance and control for your enterprise workloads

  • Google has announced the GA of the C4 machine series, the most performant general-purpose VMs for Compute Engine and GKE customers. 
  • C4 VMs are engineered from the ground up and fine-tuned to deliver industry-leading performance, with up to 20% better price-performance for general-purpose workloads and 45% better price-performance for CPU-based inference versus comparable generally available VMs from other hyperscalers. 
  • Together with the N4 machine series, C4 VMs provide the performance and flexibility you need to handle the majority of workloads, all powered by Google’s Titanium.
  • With Titanium offload technology, C4 provides high-performance connectivity with up to 20 Gbps of networking bandwidth, plus scalable storage with up to 500k IOPS and 10 GB/s of throughput on Hyperdisk Extreme.
  • C4 instances scale up to 192 vCPUs and 1.5 TB of DDR5 memory, and feature the latest-generation performance of Intel’s 5th Gen Xeon processors. 

32:42 📢 Matthew – “…the specs on this is outstanding. Like the 20 gigabytes of networking, like they really put a lot into this and it really feels like it’s going to be a good workhorse for people in the future.”

33:19 Your infrastructure resources, your way, with new GKE custom compute class API

  • Google is launching a new custom compute class API in GKE.  
  • Imagine that your sales platform is working great, and despite surging demand, your K8 infrastructure is seamlessly adapting to handle the traffic.  
  • GKE cluster autoscaler is intelligently selecting the best resources from a range of options you’ve defined. 
  • No pages for being out of resources, or capacity issues. All powered by the custom compute class API.
  • Google is giving you fine-grained control over your infrastructure choices: GKE can now prioritize and utilize a variety of compute and accelerator options based on your specific needs, ensuring that your apps, including AI workloads, always have the resources they need to thrive. 
  • GKE custom compute classes maximize obtainability and reliability by letting you define fallback compute priorities as a list of candidate node characteristics or statically defined node pools. 
  • This increases the chances of successful autoscaling while giving you control over the resources that get spun up. If your first-priority resource is unable to scale up, GKE automatically tries the second-priority node selection, then continues down the list. 
    • For example: n2d is preferred, falling back to c2d, and then to a statically defined node pool. 
  • Without custom compute classes, when top-priority capacity is unavailable, pods land on lower-priority instances and moving them back requires manual intervention; with custom compute classes, GKE can actively migrate workloads to the preferred node shape once it becomes available. 
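
A sketch of a custom compute class expressing that fallback list (field names are assumptions to verify against the GKE ComputeClass reference):

```yaml
apiVersion: cloud.google.com/v1
kind: ComputeClass
metadata:
  name: preferred-then-fallback
spec:
  priorities:
    - machineFamily: n2d            # first choice
    - machineFamily: c2d            # fallback if n2d can't scale up
    - nodepools: ["static-fallback-pool"]  # last resort: a static node pool
  activeMigration:
    optimizeRulePriority: true      # migrate back when higher priority frees up
```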

34:51 📢 Ryan – “Kubernetes is really complicated, huh?”

38:50 📢 Matthew – “I do want to point out that they had to say in this article – because this article has absolutely nothing to do with AI in any way shape or form, but it includes AI workloads because for some reason it wouldn’t have been known. and I actually checked the article because I saw it in the note or show notes, but I literally had to go into the article to be like why is that commentary necessary? Did somebody miss their AI quota for the day so they just threw it in?”

40:21 Introducing delayed destruction for Secret Manager, a new way to protect your secrets

  • Destroying your secrets just got a lot safer with the new delayed destruction of secret versions for Secret Manager.
  • This new capability helps ensure that secret material cannot be erroneously deleted, whether by accident or as part of a malicious attack. 
  • While managing secrets and secret versions was possible before, it carried some risks.  
  • Destruction of a secret version was an irreversible step: there was no way to recover a secret once destroyed, and there was no actionable alerting on attempts to destroy critical secrets, reducing the chance of timely intervention by an administrator. 
  • With the customizable delay duration, you can prevent immediate destruction of secret versions, and a new Pub/Sub event notification alerts you when a destroy action is attempted. 
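
In practice the delay is a TTL on the secret; a gcloud sketch, assuming the `--version-destroy-ttl` flag and a hypothetical secret name:

```shell
# Require a 7-day delay before any version of this secret is actually
# destroyed (flag name per the Secret Manager docs; verify for your
# gcloud version).
gcloud secrets create my-secret \
  --replication-policy="automatic" \
  --version-destroy-ttl="7d"

# Or add the delay to an existing secret.
gcloud secrets update my-secret --version-destroy-ttl="7d"
```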

41:13 📢 Ryan – “I mean, this is a good feature. AWS has it by default from the, from the rollout where there’s, takes seven days for a secret to actually go away and you can restore it up until then. The monitoring is the bigger one for me, like being able to configure a notification without trying to like, you know, scout through all the API logs for the delete secret API method. So this is nice. I like that.”

44:09 Run your AI inference applications on Cloud Run with NVIDIA GPUs

  • You can now run your AI Inference jobs on Cloud Run with NVIDIA GPUs.  
  • This allows you to perform real-time inference with lightweight open models such as Gemma 2B/7B or Meta Llama 3 (8B) or your own custom models. 
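
A deploy sketch, assuming the GPU flags that shipped under `gcloud beta run` at launch (project, image, and service names are hypothetical):

```shell
# Deploy an inference service with one NVIDIA L4 GPU attached.
gcloud beta run deploy gemma-inference \
  --image=us-docker.pkg.dev/my-project/my-repo/gemma-server:latest \
  --gpu=1 --gpu-type=nvidia-l4 \
  --no-cpu-throttling \
  --max-instances=1 \
  --region=us-central1
```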

44:33 📢 Ryan – “No, I mean, this is a great example of how to use serverless in the right way, right? These scales down, you’re doing lightweight transactions on those inference jobs. And then you’re not running dedicated hardware or maintaining an environment, which, you know, basically means that you keep warm.”

45:08  Cloud Functions is now Cloud Run functions — event-driven programming in one unified serverless platform

  • Cloud Functions is now Cloud Run functions, which is stupid.  This goes beyond a simple name change, though: Google has unified Cloud Functions infrastructure with Cloud Run, and developers on Cloud Functions 2nd gen get immediate access to all new Cloud Run features, including NVIDIA GPUs. 
  • In addition, Cloud Functions 2nd gen customers have access to all Cloud Run capabilities, including:
    • Multi-event triggers
    • High-performance direct VPC egress
    • The ability to mount Cloud Storage volumes (so Justin can run SQL ♥️) 
    • Google-managed language runtimes
    • Traffic splitting
    • Managed Prometheus and OpenTelemetry
    • Inference functions with NVIDIA GPUs

46:56 📢 Justin – “Yeah, I started to wonder why you would just use Cloud Run. Unless you’re getting some automation with Cloud Run functions that I’m not familiar enough with. But the fact that you get all the Cloud Run benefits with Cloud Functions, and if I get some advantage using functions, I guess it’s a win.”

47:57 What’s New in Assured Workloads: Enable updates and new control packages

  • Compliance isn’t a one-time job, so Google is releasing several updates to Assured Workloads, which helps your organization meet compliance requirements. 
  • The new Compliance Updates feature lets you evaluate whether your current Assured Workloads folder configuration differs from the latest available configuration, and enables you to upgrade previously created AW folders to the latest. 
  • Expanded regional controls: Assured Workloads is now available in over 30 regions and 20 countries. 
  • Regional controls now support over 50 of the most popular Google Cloud services (45% more than the year prior).
  • And there are now over 100 newly FedRAMP High-authorized services, including Vertex AI, Cloud Build, Cloud Run, and Cloud Filestore, as well as powerful security controls on their secure-by-design, secure-by-default cloud platform, such as VPC Service Controls, Cloud Armor, Cloud Load Balancing, and reCAPTCHA. 

48:36 📢 Justin – “Which means AI is coming to the government.” 

50:22 Try the new Managed Service for Apache Kafka and take cluster management off your todo list  

  • Running distributed event processing and storage systems like Apache Kafka can push your ops team to the brink.  There are tons of ways to secure, network, and autoscale your clusters, but Google is pleased to offer a shortcut with the new Google Cloud Managed Service for Apache Kafka.  The service takes care of the high-stakes, sometimes tedious work of running the infrastructure, and is an alternative to Cloud Pub/Sub. 
  • You can have Kafka clusters in 10 different VPC networks. 

51:13 📢 Justin – “There was no mention of multi-region support, which is really what I need out of this service, versus in-region support. But if they can make this multi-region over time, I’m sort of in on this one.”

52:57 Announcing Terraform Google Provider 6.0.0: More Flexibility, Better Control

  • Like Azure, Google is also getting a new provider: 6.0.0 is now GA, and the combined HashiCorp/Google provider team has listened closely to customer feedback. 
  • Some of the key notable (but somehow also not very notable) changes:
    • An opt-out default label, “goog-terraform-provisioned” (which isn’t helpful)
      • As a follow-up to the addition of provider-level default labels in 5.16, this version makes the default label opt-out.  The label is added automatically to anything created by the Terraform provider; previously you had to opt in to the label, now you have to opt out. 
    • Deletion protection fields added to multiple resources
      • google_domain, google_cloud_run_v2_job, google_cloud_run_v2_service, google_folder, and google_project. (Which should have had delete protection before this, but what do we know.)   
    • The ability to reduce the suffix length in “name_prefix”: the max length for the user-defined name prefix has increased from 37 characters to 54. 
    • There is an upgrade guide available and I’m sure more will be coming out. 
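
A sketch of the 6.0 label opt-out and one of the new deletion-protection fields (attribute names are assumptions to verify against the 6.0 upgrade guide):

```hcl
terraform {
  required_providers {
    google = {
      source  = "hashicorp/google"
      version = "~> 6.0"
    }
  }
}

provider "google" {
  # Opt out of the automatic "goog-terraform-provisioned" default label.
  add_terraform_attribution_label = false
}

resource "google_cloud_run_v2_service" "example" {
  name     = "example-svc"
  location = "us-central1"
  # New in 6.0: must be set to false before the service can be destroyed.
  deletion_protection = true
  template {
    containers {
      image = "us-docker.pkg.dev/cloudrun/container/hello"
    }
  }
}
```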

Azure

55:19 Elevate your AI deployments more efficiently with new deployment and cost management solutions for Azure OpenAI Service including self-service Provisioned

  • Azure OpenAI Service, which helps its 60,000-plus customers manage their AI deployments, is announcing significant updates to make AI more cost-efficient and effective. 
  • So what’s new?
    • Self-service provisioning and model-independent quota requests, letting you request Provisioned Throughput Units (PTUs) more flexibly and efficiently. This new feature empowers you to manage your Azure OpenAI Service quota deployments independently, without relying on support from your account team.  
    • By decoupling quota requests from specific models, you can now allocate resources based on your immediate needs and adjust as your requirements evolve.
    • Visibility into service capacity and availability: now know in real time about service capacity in different regions, ensuring that you can plan and manage your deployments effectively. 
    • Provisioned hourly pricing and reservations
      • Hourly no-commit purchasing
      • Monthly and yearly azure reservations for provisioned deployments

56:22 📢 Matthew – “These are, while they sound crazy, extremely useful because as soon as like, was it 4.0 came out, we had to go like. Boy them because otherwise we were worried we were locked out of the region. So even though we weren’t using them yet, our accounting was like, make sure you deploy them as soon as you see the announcement that may or may not be coming out in a very, in the next couple of days and, and do the units that you’re going to need for production, even though you, didn’t know what we needed yet.”

58:32 Announcing General Availability of Attach & Detach of Virtual Machines on Virtual Machine Scale Sets

  • Azure is thrilled to announce that attaching and detaching VMs to and from a Virtual Machine Scale Set (VMSS) with no downtime is now GA. This functionality is available for scale sets with Flexible Orchestration Mode and a fault domain count of 1.
  • Benefits:
    • Let Azure do the work
    • Easy to Scale
    • No Downtime
    • Isolated Troubleshooting
    • Easily Move VMs
  • And yes – Azure is thrilled. That’s in the announcement. Really. 

59:10 📢 Matthew – “And this is only for flexible, so if you’re not using flexible, which has other issues already with it, like and you are you have to be in a fault counts, you actually have more than capacity than you need. So there’s very specific ways that you can leverage this.”

1:04:29 Announcing mandatory multi-factor authentication for Azure sign-in

1:06:18 📢 Matthew – “Or you just run your worker nodes inside and use the, whatever they call it, service principal to, which is like an IAM role to handle the authentication for you, which definitely works great with Atlantis.”

1:06:47 Boost your AI with Azure’s new Phi model, streamlined RAG, and custom generative AI models 

  • Azure is announcing several updates to help developers quickly create AI solutions with greater choice and flexibility leveraging the Azure AI toolchain:
    • Improvements to the Phi family of models, including a new Mixture of Experts (MoE) model and 20+ languages
    • AI21 Jamba 1.5 Large and Jamba 1.5 on Azure AI models as a service
    • Integrated vectorization in Azure AI search to create a streamlined retrieval augmented generation (RAG) pipeline with the integrated data prep and embedding
    • Custom generative extraction model in Azure AI Document Intelligence, so you can now extract custom fields for unstructured documents with high accuracy. 
    • The GA of Text to speech Avatar, a capability of Azure AI speech service, which brings natural-sounding voices and photorealistic avatars to life, across diverse languages and voices, enhancing customer engagement and overall experience
    • GA of the VS Code extension for Azure Machine Learning.
    • The GA of Conversational PII detection Service in Azure AI Language

Closing

And that is the week in the cloud! Visit our website, the home of The Cloud Pod, where you can join our newsletter, Slack team, send feedback, or ask questions at thecloudpod.net, or tweet at us with the hashtag #thecloudpod.
