283: You’ve Got Re:Invent Predictions


Welcome to episode 283 of The Cloud Pod, where the forecast is always cloudy! Break out your crystal balls and shuffle those tarot decks, because it’s Re:Invent prediction time! Sorry we missed you all last week – the plague has been strong with us. But Justin and Jonathan are BACK, and we’ve got a ton of news, so buckle in and let’s get started!

Titles we almost went with this week:

🍧Not My Snowcones!
🪟Lambda at 10: Still Better Than Windows Containers

A big thanks to this week’s sponsor:

We’re sponsorless! Want to get your brand, company, or service in front of a very enthusiastic group of cloud news seekers? You’ve come to the right place! Send us an email or hit us up on our Slack channel for more info.

General News

01:27 The voice of America Online’s “You’ve got mail” has died at age 74

Elwood Edwards, the voice behind AOL’s iconic “You’ve got mail” sound notification, has died at the age of 74. He was just one day shy of his 75th birthday.
The “You’ve got mail” soundbite started in 1989 when Steve Case, CEO of Quantum Computer Services (which would later become America Online, or AOL), wanted to add a human voice to the Quantum online service.
Karen Edwards, who worked as a customer service representative, heard Case discussing the plan and suggested her husband Elwood, a professional broadcaster.
Edwards recorded the famous phrase and others (“Welcome” “File’s done” and “Goodbye” among them) on a cassette recorder in his living room.
He was paid $200 for the service.
His voice is still used to greet users of the current AOL service.

AWS

03:04 It’s Time for RE:Invent Predictions!

Matt

Large Green Computing Reinvent
LLM at the Edge
Something new On S3

Ryan (AI)

Improved serverless observability tools
Expansion of AI Driven workflows in datalakes
Greater Focus on Multi-Account or Multi-region orchestration, centralized compliance management, or enhanced security services

Jonathan

New Edge Computing Capabilities better global application deployment type features. (Cloudflare competitor maybe)
New automated cost optimization tools
Automated RAG/vector to S3

Justin

Managed Backstage or platform like service
New LLM multi-modal replacement or upgrade to Titan
Competitor VM offering to Broadcom

Honorable Mentions

Jonathan:

Deeper integration between serverless and container services

New Region

Enhanced Observability with AI driven debugging tool

Justin:

Multi Cloud management – in a bigger way (Anthos competitor)

Agentic AI toolings

New ARM graviton chip

How many times will AI or Artificial Intelligence be said:

Justin – 35

Jonathan – 72

And now it’s time for Pre:Invent announcements:

20:09 Introducing Express brokers for Amazon MSK to deliver high throughput and faster scaling for your Kafka clusters

Amazon is announcing the general availability of Express Brokers, a new broker type for Amazon Managed Streaming for Apache Kafka (MSK).
The new Express Broker is designed to deliver up to 3x more throughput per-broker, scale up to 20 times faster, and reduce recovery time by 90 percent – as compared to standard brokers running Apache Kafka.
Express Brokers come preconfigured with Kafka best practices by default.
They also support Kafka APIs and provide the same low-latency performance that Amazon MSK customers expect, so you can continue using existing client applications without any changes.
Express Brokers provide improved compute and storage elasticity for Kafka applications when using Amazon MSK provisioned clusters.
Some of the key features of the new Express Brokers include:
- Easier operations with hands-free storage management
- Fewer brokers with up to 3x throughput per broker
- Higher utilization with 20 times faster scaling
- Higher resilience with 90 percent faster recovery
Cost-wise (Ohio region):
- express.m7g.4xlarge – 16 vCPU – 64 GiB – $3.264 per hour
- Standard broker – 16 vCPU – 64 GiB – $1.632 per hour
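A quick back-of-the-envelope check on those Ohio prices: the Express broker costs twice as much per hour but is claimed to deliver up to 3x the throughput. The sketch below normalizes a Standard broker's throughput to 1.0 (an assumption for illustration, since no absolute throughput figure is given here) and treats "up to 3x" as a best case, so real-world savings may be smaller.

```python
# Hourly prices (USD, Ohio) taken from the notes above.
STANDARD_PRICE = 1.632
EXPRESS_PRICE = 3.264

# Relative throughput. Standard is normalized to 1.0 (illustrative assumption);
# Express uses the "up to 3x" claim as a best case.
STANDARD_THROUGHPUT = 1.0
EXPRESS_THROUGHPUT = 3.0


def cost_per_throughput_unit(price_per_hour: float, relative_throughput: float) -> float:
    """Hourly cost of one normalized unit of throughput."""
    return price_per_hour / relative_throughput


standard = cost_per_throughput_unit(STANDARD_PRICE, STANDARD_THROUGHPUT)
express = cost_per_throughput_unit(EXPRESS_PRICE, EXPRESS_THROUGHPUT)

price_ratio = EXPRESS_PRICE / STANDARD_PRICE  # how much more one Express broker costs
savings = 1 - express / standard              # best-case saving per unit of throughput

print(f"Express costs {price_ratio:.1f}x as much per broker-hour")
print(f"Standard: ${standard:.3f} per throughput unit-hour")
print(f"Express:  ${express:.3f} per throughput unit-hour")
print(f"Best-case saving per unit of throughput: {savings:.0%}")
```

In other words, the math only favors Express if you actually need the extra throughput; a lightly loaded cluster would simply pay double.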

21:10 📢 Jonathan – “it seems like would be a no-brainer if you’re running enough single brokers to meet their capacity, then switching to these as long as you maintain your redundancy would be kind of a no-brainer. I wonder what they’ve done exactly to make this new class of instances. They’re not just bigger instances, surely.”

22:13 Amazon EBS now supports detailed performance statistics on EBS volume health

Amazon is really ticking off a ton of Justin’s requests for CloudWatch!
This week, CW gets detailed performance statistics for EBS volumes. This new capability provides you with real-time visibility into the performance of your EBS volumes, making it easier to monitor the health of your storage resources and take action sooner if things go south.
You can access 11 metrics at up to per-second granularity to monitor input/output statistics of your EBS volumes, including driven I/O and I/O latency histograms.

22:44 📢 Justin – “So, you know, in the early days of auto scaling, one of the things that a lot of customers would do was they would create testing when the node would come up and they would actually test the IO throughput to the EBS volume because they were not always created equal. And so if you got a bad EBS volume, you create another one or rescale or kill that node and try again until you get one that performs to your specifications. So now, at least exposing this to you so you can actually just monitor it from CloudWatch, which is a much simpler way than running a bunch of automated tests.”

24:00 EC2 Auto Scaling introduces provisioning control on strict availability zone balance

Amazon EC2 Auto Scaling groups (ASGs) now let customers strictly balance their workloads across Availability Zones, giving greater control over the provisioning and management of EC2 instances.
Previously, if you wanted to strictly balance ASG instances across AZs, you had to override the default behavior of EC2 Auto Scaling and invest in custom code to modify the ASG’s existing behaviors with lifecycle hooks, or maintain multiple ASGs.

24:24 📢 Justin – “…one of the things, if you are in a region with three zones and you want three nodes in your auto scaling group, it’ll spin up A and B and then they say C doesn’t have the capacity. It’ll just keep spinning away at C – letting you know that it’s not launching that server forever, which is just terrible. So now you at least say like look, I still want segmentation. I would still want at least two regions, but that third node can’t spin up in C. You can just put it in B or A.”

25:55 Amazon Bedrock Prompt Management is now available in GA

Amazon is announcing the GA of Amazon Bedrock Prompt Management, with new features that provide enhanced options for configuring your prompts and enabling seamless integration for invoking them in your generative AI applications.
Amazon Bedrock Prompt Management simplifies the creation, evaluation, versioning and sharing of prompts to help developers and prompt engineers get better responses from foundation models (FMs) for their use cases.

26:19 📢 Jonathan – “ Yeah, you can always ask A.I. to write a prompt for you, which has always worked really well for me. Yeah, this is kind of nice. I’ve been using Langchain in Python recently. I think it’s also available for TypeScript as well. But Langchain supports creating prompt templates, and then you can string a whole series of things together and build agents and all kinds of stuff. So it’s nice to see that they’re kind of catching up with what the open source community already has in terms of usability for this.”

27:03 AWS Snow device updates

Amazon is taking our Snowcones and reducing the options for Snowballs.
Effective November 12, 2024, AWS has discontinued three previous-generation, end-of-life Snowball device models: the Storage Optimized 80TB, the Edge Compute Optimized with 52 vCPU, and the Compute Optimized with GPU devices.
You will no longer be able to order these models, and if you have one in your environment, you have one year to return the unit.
The only Snowballs that will continue to be supported are the Storage Optimized 210TB devices with NVMe storage and the Compute Optimized devices with 104 vCPU and 28TB of full-SSD NVMe storage for edge workloads.
If these two options don’t work for your edge computing needs, AWS offers Outposts solutions in 1U, 2U and 42U configurations.

28:11 📢 Jonathan – “It’s interesting, kind of in the hindsight, we wondered who really used these things to begin with. And maybe it was just a good idea. Maybe it was internally used and they thought other people would want to use them and there just wasn’t a market for it.”

29:57 AWS Lambda SnapStart for Python and .NET functions is now generally available

SnapStart now supports Python and .NET, two years after it was introduced for Java functions.
Lambda SnapStart caches and reuses the snapshotted memory and disk state of any one-time initialization code, meaning code that runs only the first time the Lambda function is invoked.
For Python functions, startup latency from initialization code can be several seconds; when you add in dependencies, this can balloon to 10+ seconds.
SnapStart can reduce latency from several seconds to as low as sub-second for these scenarios.
For .NET functions, AWS expects most use cases to benefit because .NET just-in-time compilation takes up to several seconds.
Latency variability associated with the initialization of Lambda functions has been a long-standing barrier to Lambda adoption for .NET use cases.
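To make "one-time initialization code" concrete, here is a minimal sketch of a Python handler module. Everything at module scope runs once when the execution environment starts; the handler runs on every invocation. `expensive_setup` and the sleep are hypothetical stand-ins for slow work such as importing heavy dependencies, and the code runs locally without any AWS services.

```python
import time

INIT_CALLS = 0  # counts how often the slow setup actually runs


def expensive_setup() -> dict:
    """Hypothetical stand-in for slow startup work (heavy imports, loading config)."""
    global INIT_CALLS
    INIT_CALLS += 1
    time.sleep(0.05)  # simulated startup cost
    return {"ready": True}


# Module scope: runs once when the execution environment initializes.
# This is the kind of state SnapStart is described as snapshotting and reusing.
STATE = expensive_setup()


def handler(event, context):
    """Runs on every invocation and only reads the pre-built state."""
    return {"ready": STATE["ready"], "echo": event}


if __name__ == "__main__":
    for i in range(3):
        print(handler({"n": i}, None))
    print("setup ran", INIT_CALLS, "time(s)")
```

The more work that sits at module scope rather than inside the handler, the more a snapshot of the initialized state has to offer.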

30:58 📢 Jonathan – “Wow, mean, just think of the cost saving. In usage, let alone the virtual capacity increase they’ve just got if everyone just suddenly starts using this. Even if it’s just two seconds per invocation that they’re saving, that’s two seconds they can sell to somebody else.”

31:51 AWS Lambda turns ten – looking back and looking ahead

Lambda turns 10!
As many services are now reaching this milestone, we’re not sure how much we’ll talk about these, but Lambda was a big deal when it was launched, and deserves a mention.
Jeff Barr writes that today over 1.5 million Lambda users collectively make tens of trillions of function invocations per month.
Key milestones:
- 2014 – Lambda announced in preview ahead of Re:Invent with support for Node.js and the ability to respond to event triggers from S3 buckets, DynamoDB and Kinesis streams.
- 2015 – GA, with support for SNS notifications as triggers and functions written in Java.
- 2016 – Python support, increased function duration to 5 minutes (it was later increased to 15 minutes), ability to access resources in a VPC, and the Serverless Application Model, as well as the launch of Step Functions.
- 2017 – X-Ray support.
- 2018 – SQS support, CloudFormation extensions and the ability to write Lambda functions in any language.
- 2019 – Provisioned concurrency.
- 2020 – Savings Plans and PrivateLink support, 1ms billing granularity, up to 10GB of memory and 6 vCPUs, and support for container images.
- 2021 – S3 Object Lambda.
- 2022 – 10GB of temporary storage (which was controversial, if we recall.)
- 2024 – New observability capabilities with Logs, Java functions that use ARM, recursive loop and new IDE methods.
Looking ahead, Jeff Barr talks about the next decade of serverless, where he believes:
- Serverless will be the default choice
- Continued shift toward composability
- Automated, AI-optimized infra management
- Extensibility and integration
- Security – Threat detection and AI assisted remediation will work to make serverless apps more secure.

36:15 Centrally managing root access for customers using AWS Organizations

IAM is launching a new capability that allows security teams to centrally manage root access for member accounts in AWS Organizations.
You can now easily manage root credentials and perform highly privileged actions.
Since the beginning, AWS accounts have been provisioned with highly privileged root user credentials, which have unrestricted access across the account. While powerful, this posed significant security risks.
Many customers built manual processes to ensure MFA was enabled, rotate root credentials regularly, and store credentials securely in vaults. This becomes problematic, however, as you scale into the hundreds of accounts that most enterprises run.
In addition, specific root actions, such as unlocking S3 bucket policies or SQS resource policies, required the root credentials.
Now with this new ability you get central management of root credentials and root sessions. Together, they offer security teams a secure, scalable and compliant way to manage root access across AWS organization member accounts.
Central management of root credentials:
- Remove long term root credentials programmatically from member accounts.
- Prevent credential recovery
- Provision secure-by-default accounts
- Help you stay compliant.
But sometimes you may still need the ability to do something with root, and for that they are launching root sessions:
- Secure alternative to maintaining long-term root access. Now you gain short-term, task-scoped root access to member accounts.
- Root session benefits:
  - Task-scoped root access
  - Centralized management
  - Alignment with AWS best practices
This new capability isn’t giving you full root access, just temporary credentials to perform one of the following actions:
- Auditing root user credentials
- Re-enabling account recovery
- Deleting root user credentials
- Unlocking an S3 bucket policy
- Unlocking an SQS queue policy
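As a rough sketch of how those five tasks might be requested in practice: our understanding is that root sessions are obtained through the STS `AssumeRoot` API, which takes a target member account and one AWS-managed "root task" policy. The policy names below are what we believe the AWS documentation lists, so verify them before relying on this. The task keys, the function name, and the account ID are hypothetical, and the code only builds the request parameters without calling AWS.

```python
# Assumed mapping from the tasks listed above to AWS-managed root task policy
# names (verify against the current STS AssumeRoot documentation).
ROOT_TASK_POLICIES = {
    "audit_root_credentials": "IAMAuditRootUserCredentials",
    "reenable_account_recovery": "IAMCreateRootUserPassword",
    "delete_root_credentials": "IAMDeleteRootUserCredentials",
    "unlock_s3_bucket_policy": "S3UnlockBucketPolicy",
    "unlock_sqs_queue_policy": "SQSUnlockQueuePolicy",
}


def build_assume_root_request(member_account_id: str, task: str,
                              duration_seconds: int = 900) -> dict:
    """Build keyword arguments for a task-scoped root session (hypothetical helper).

    The result is meant to be passed to something like
    boto3.client("sts").assume_root(**request) from the management or
    delegated administrator account; nothing is sent here.
    """
    if not (member_account_id.isdigit() and len(member_account_id) == 12):
        raise ValueError("member_account_id must be a 12-digit AWS account ID")
    if task not in ROOT_TASK_POLICIES:
        raise ValueError(f"unknown task: {task!r}")
    policy_name = ROOT_TASK_POLICIES[task]
    return {
        "TargetPrincipal": member_account_id,
        "TaskPolicyArn": {"arn": f"arn:aws:iam::aws:policy/root-task/{policy_name}"},
        "DurationSeconds": duration_seconds,
    }


if __name__ == "__main__":
    # 111122223333 is a placeholder account ID.
    print(build_assume_root_request("111122223333", "unlock_s3_bucket_policy"))
```

The point of the design is visible in the request itself: one account, one task, one short-lived session, rather than a standing root password.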

39:12 📢 Jonathan – “It’s wonderful. No longer have to explain to the security team that setting the root password at some 64 character random password and then discarding it was actually a secure option, which I still think was a secure option after use.”

40:30 Introducing Amazon Route 53 Resolver DNS Firewall Advanced

Amazon must have hired someone from Azure to build this capability…
We are now getting Route 53 Resolver DNS Firewall Advanced, a new set of capabilities for the existing firewall that lets you monitor and block suspicious DNS traffic associated with advanced DNS threats, such as DNS tunneling and Domain Generation Algorithms (DGAs). These threats are designed to avoid detection by threat intelligence feeds, or are difficult for threat intelligence feeds alone to track and block in time.

41:35 Amazon DynamoDB lowers pricing for on-demand throughput and global tables

AWS engineering has been working on making DynamoDB more efficient, and through this they have identified and are passing along cost savings to you.
Effective November 1st, DynamoDB has reduced prices for on-demand throughput by 50% and global tables by up to 67%, making it more cost-effective than ever to build, scale, and optimize applications.
AWS points out that while provisioned capacity workloads were reasonable in the past, the new on-demand pricing benefits will result in most customers achieving a lower price with on-demand mode.
On-demand mode also lets you skip capacity planning, get automatic, usage-based pricing instead of capacity-based pricing, and scale to zero, all of which makes it easier to adopt serverless capabilities.

41:58 📢 Justin – “…one of the interesting things I have found in this article was that it points out that while provisioning capacity, where those were reasonable in the past, the new on-demand pricing benefit will result in most customers achieving a lower price with on-demand nodes. We’ll still meet the capacity need without having to capacity plan or do scaling of that capacity throughput. So they’re actually saying that, because of this price adjustment, the cost benefit is much better. And so you should definitely consider moving back to on-demand DynamoDB.”

43:52 Introducing resource control policies (RCPs), a new type of authorization policy in AWS Organizations

Amazon is introducing resource control policies (RCPs), a new authorization policy managed in AWS Organizations that can be used to set the maximum available permissions on resources within your entire organization.
They are a type of preventative control that helps you establish data perimeters in your AWS environment and restrict access to resources at scale.
RCPs currently support S3, STS, KMS, SQS, and Secrets Manager.
You might be asking what the differences are between Service Control Policies (SCPs) and RCPs. We got you.
- SCPs limit permissions granted to principals (IAM roles/users)
- RCPs limit permissions available on the resources themselves
- RCPs are evaluated when resources are accessed, regardless of who is making the API request
Some key use cases:
- Enforcing organization wide resource access controls
  - Ensure S3 buckets can only be accessed by principals within your organization
  - Prevent unauthorized external access even if developers accidentally configure overly permissive policies
Combining SCPs and RCPs gives you the ability to set maximum allowable permissions from different angles (principals vs. resources). Used together, they create a comprehensive security baseline for organizations needing strict access controls.
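For the S3 use case above, an RCP might look roughly like the following sketch, which builds the policy document as a Python dict and prints it as JSON. It uses the IAM global condition keys `aws:PrincipalOrgID` and `aws:PrincipalIsAWSService`; the organization ID, the `Sid`, and the function name are hypothetical placeholders, and you should test any such policy carefully before attaching it, since a broad deny can lock out legitimate access.

```python
import json

ORG_ID = "o-exampleorgid"  # hypothetical placeholder organization ID


def build_s3_org_only_rcp(org_id: str) -> dict:
    """Sketch of an RCP that denies S3 access to principals outside the org.

    AWS service principals are exempted so that service integrations
    are not blocked by the deny.
    """
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "EnforceOrgIdentities",  # hypothetical statement ID
                "Effect": "Deny",
                "Principal": "*",
                "Action": "s3:*",
                "Resource": "*",
                "Condition": {
                    # Deny when the caller's organization is not ours
                    "StringNotEqualsIfExists": {"aws:PrincipalOrgID": org_id},
                    # ...unless the caller is an AWS service principal
                    "BoolIfExists": {"aws:PrincipalIsAWSService": "false"},
                },
            }
        ],
    }


if __name__ == "__main__":
    print(json.dumps(build_s3_org_only_rcp(ORG_ID), indent=2))
```

Because the statement is a deny evaluated on the resource side, it applies even if a developer attaches an overly permissive bucket policy.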

45:54 📢 Justin – “…it sounds boring, but then when you think about it, it’s like, this is actually really cool.”

GCP

46:31 Dataplex Automatic Discovery makes Cloud Storage data available for Analytics and governance

Ever-growing data, both structured and unstructured, continues to make it a challenge to locate the right data at the right time, and a significant portion of enterprise data remains undiscovered or underutilized, often referred to as “dark data”.
To help address dark data, Google is announcing automatic discovery and cataloging of Google Cloud Storage data with Dataplex, part of BigQuery’s unified platform for intelligent data to AI governance.
- Automatically discover valuable data assets residing within cloud storage, including structured and unstructured data such as documents, files, PDFs, images and more
- Harvest and catalog metadata for your discovered assets by keeping schema definitions up to date with built-in compatibility checks and partition detection, as data evolves
- Enable analytics for data science and AI use cases at scale with auto-created BigLake, external or object tables, eliminating the need for data duplication or manually creating table definitions.

47:41 📢 Justin – “…you know, data is the new currency. So finding your data and your organization can be somewhat a needle in the haystack; because everyone stores data where they think they need it. And then you have different enterprise systems, different SaaS applications are using… so, you know, to have a system that’s kind of inside of your environment, that’s able to automatically scan and find your data assets and then pull them into a data lake. Even if you don’t need them, that’s just incredibly valuable just for discovery.”

49:39 Shift-left your cloud compliance auditing with Audit Manager

Audit Manager from Google is now generally available.
Audit Manager will help you accelerate your compliance efforts by providing:
- Clear shared responsibility outlines: A matrix of shared responsibilities that delineates compliance duties between the cloud provider and customers, offering actionable recommendations tailored to your workloads.
- Automated Compliance Assessments: Evaluation of your workloads against industry-standard technical control requirements in a simple and automated manner.
- Audit-ready evidence – Automated generation of comprehensive, verifiable evidence reports to support your compliance claims and overarching governance activity.
- Actionable Remediation Guidance.

50:56 📢 Jonathan – “I wonder if compliance auditors in general will eventually die off, not literally, but I wonder if Google or Amazon or somebody else could actually build a tool which you say, I want to be compliant with X framework will reach a point where it can be trusted enough to go and do assessments, collect data, generate reports, and then give you findings without the involvement of the PWCs or anybody else of the world.”

53:20 65,000 nodes and counting: Google Kubernetes Engine is ready for trillion-parameter AI models

For the masochists out there, GKE now supports clusters of up to 65,000 nodes, which Google believes is more than 10x what either AWS or Azure can do.
Why would you want 65,000 nodes, you might ask? Well, AI of course!
Combined with access to accelerators such as GPUs and Cloud TPU v5e nodes, that gives you the ability to manage over 250,000 accelerators in one cluster.
Some recent GKE innovations:

53:51 📢 Justin – “You’re gonna need to communicate with your account rep before you spin up your 65,000 GKE nodes.”

Azure

55:55 Windows Server 2025 now generally available, with advanced security, improved performance, and cloud agility

Windows Server 2025 (mentioned earlier) is now generally available, which also means Windows Server 2019 is now entering “end of servicing” and will reach end of support in January 2029.
Note to listeners: As a reminder, Windows Server 2016 reaches end of support in January 2027.
Microsoft’s goal is to deliver a secure and high-performance Windows Server platform tailored to meet the diverse needs of their customers.
This release is designed to let you deploy apps in any environment, whether it’s on-premises, hybrid or in the cloud.
Some of the key areas of investment in 2025 are interesting:
- Advanced Multi-layered Security
  - AD – gets new security capabilities including improvements in protocols, encryption, hardening and new cryptographic support
  - File services/Server Message Block (SMB) hardening. 2025 includes SMB over QUIC to enable secure access to file shares over the internet. SMB security also has hardened firewall defaults, brute force attack prevention and protections against man-in-the-middle, relay and spoofing attacks.
  - Delegated Managed Service Accounts (dMSA): Unlike traditional service accounts, dMSAs don’t require manual password management since AD takes care of it. With dMSAs, specific permissions can be delegated to access resources in the domain, which reduces security risks and provides better visibility and logs of service account activity
- Cloud Agility anywhere
  - Hotpatching enabled by Azure Arc – Customers operating fully in the cloud have inherent modern security advantages like automatic software updates and backup and recovery. Microsoft is bringing some of those cloud capabilities to Windows Server 2025 on premises with a new hotpatching subscription service, enabled by Azure Arc. With hotpatching, customers will experience fewer reboots and minimal disruption to operations.
  - Easy Azure Arc onboarding, enabling hybrid features and enhanced operational flexibility
  - SDN multisite – The software-defined networking (SDN) multisite feature offers native L2 and L3 connectivity for workload migrations across various locations, coupled with unified network policy management
  - Unified policy management allowing for centralized management of network policies, making it easier to maintain consistent security and performance standards across your hybrid cloud environment
- AI, performance and scale
  - Hyper-V, AI, Machine Learning – With built-in support for GPU partitioning and the ability to process large data sets across distributed environments, Windows Server 2025 offers a high-performance platform for both traditional applications and advanced AI workloads, with live migration and high availability
  - NVMe storage performance – Windows Server 2025 delivers up to 60% more storage IOPS performance compared to Windows Server 2022 on identical systems.
  - Storage Spaces Direct and storage flexibility – Windows Server has supported a wide range of storage solutions such as local, NAS, and SAN for decades, and continues to do so. Windows Server 2025 adds more storage innovation with native ReFS deduplication and compression, thinly provisioned storage spaces, and storage replica compression, now available in all editions of Windows Server 2025
  - Hyper-V performance and scale: Windows Server 2025 Hyper-V can now support 240TB of memory per VM and 2,048 virtual processors (VPs) per VM.

53:51 📢 Jonathan – “Wow, that’s lot of new stuff. guess I was thinking, well, who, you know, in the cloud, they typically don’t allow virtualization anyway. So who would need all these features? Well, they need it for themselves. They need it for them. They built this, this is Windows 2025 Azure release.”

1:02:35 Enhance the security and operational capabilities of your Azure Kubernetes Service with Advanced Container Networking Services, now generally available

- Azure is announcing the general availability of Advanced Container Networking Services for Azure Kubernetes Service.
- ACNS focuses on delivering a seamless and integrated experience that allows you to maintain robust security postures and gain deep insights into your network traffic and application performance.
- This ensures that your containerized applications are not only secure but also meet your performance and reliability goals allowing you to confidently manage and scale your infrastructure.
- ACNS observability features:
  - Node-level metrics
  - Hubble Metrics, DNS and Pod level metrics
  - Hubble flow logs
  - Service Dependency Map
- ACNS Container Network Security Features:
  - FQDN filtering and security agent DNS proxy
  - Cilium Agent
  - Security Agent DNS proxy
“At H&M Group, platform engineering is a core practice, supported by our cloud-native internal developer platform, which enables autonomous product teams to build and host microservices. Deep network observability and robust security are key to our success, and the Advanced Container Networking Service features help us achieve this. Real-time flow logs accelerate our ability to troubleshoot connectivity issues, while FQDN filtering ensures secure communication with trusted external domains.” – Magnus Welson, Engineering manager, container platform, H&M Group

1:05:04 Unlocking the future: Azure networking updates on security, reliability, and high availability

Several new networking updates to help with security, reliability and high availability
Security enhancements
- Bastion Developer SKU GA
- Virtual network Encryption: FPGA powered encryption for VM to VM Communication
- DNSSEC support in preview
Reliability
- ExpressRoute Metro SKU
- Maximum Resiliency (4 independent ingress paths to Azure)
- New guided configuration for multi-site ExpressRoute
Load Balancer Improvements
- Admin State
- Cross Subscription Support
- Enhanced Health Status Monitoring with detailed reason codes
Scaling and Management
- Increased IP address Support: up to 1 million routable IP addresses per Virtual network
- IPAM in preview
- Virtual Network Verifier: static analysis for packet flow validation

1:07:00 Announcing the availability of Azure OpenAI Data Zones and latest updates from Azure AI

Azure OpenAI Data Zones for the US and EU give you new deployment options that provide enterprises with more flexibility and control over data privacy and residency needs.
Your data is stored and processed within specific geographic boundaries, supporting compliance with regional data residency requirements while maintaining optimal performance.
Azure has also enabled Prompt caching for o1-preview, o1-mini, GPT-4o and GPT-4o-mini on Azure OpenAI service.
With prompt caching, they’re giving you a 50% discount on cached input tokens on the Azure OpenAI Standard offering, along with faster processing times.
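To see what a 50% discount on cached input tokens means for a bill, here is a small sketch. The per-million-token price and the token counts are hypothetical numbers chosen for illustration, not published Azure pricing; only the 50% discount comes from the announcement.

```python
def input_token_cost(total_tokens: int, cached_tokens: int,
                     price_per_million: float, cached_discount: float = 0.5) -> float:
    """Cost of the input side of a request when some tokens hit the prompt cache."""
    if not 0 <= cached_tokens <= total_tokens:
        raise ValueError("cached_tokens must be between 0 and total_tokens")
    full_rate = price_per_million / 1_000_000
    uncached_tokens = total_tokens - cached_tokens
    return (uncached_tokens * full_rate
            + cached_tokens * full_rate * (1 - cached_discount))


PRICE_PER_MILLION = 2.50  # hypothetical USD price per 1M input tokens

# A 10,000-token prompt where an 8,000-token shared prefix is served from cache.
without_cache = input_token_cost(10_000, 0, PRICE_PER_MILLION)
with_cache = input_token_cost(10_000, 8_000, PRICE_PER_MILLION)
saving = 1 - with_cache / without_cache

print(f"Without caching: ${without_cache:.4f}")
print(f"With caching:    ${with_cache:.4f}")
print(f"Input cost saving: {saving:.0%}")
```

The saving scales with the share of the prompt that is cached, so long, repeated system prompts benefit the most.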
Provisioned global deployment offering: They are lowering the initial deployment quantity for GPT-4o models to 15 provisioned throughput units (PTUs), with additional increments of 5 PTUs.
They are also lowering the price for Provisioned global hourly by 50% to broaden access to OpenAI Services.
Several new models are available
- Healthcare industry models include MedImageInsight, MedImageParse, CXRReportGen
- Ministral 3B from Mistral AI
- Cohere Embed 3
- Fine tuning is GA for Phi 3.5 family

1:07:52 📢 Jonathan – “Prompt caching is probably a poor name for it actually, it really isn’t. Well, it’s kind of caching the… I guess it’s caching parts of Prompt. It’s caching… it’s like not reloading tokens into memory before inference. It’s like you can reuse the same or common parts.”

1:08:57 Introducing Hyperlight: Virtual machine-based security for functions at scale

Microsoft Azure Core Upstream team is excited to announce the Hyperlight project, an open-source Rust library you can use to execute small, embedded functions using hypervisor-based protection for each function call at scale.
It can do this at a speed that enables each function request to have its own hypervisor for protection.
Hyperlight is a library to execute functions as fast as possible while isolating those functions within a VM.
Developers and software architects can use Hyperlight to add serverless customizations to their applications that are able to securely run untrusted code. Hyperlight enables these for IoT gateway function embedding, high-throughput cloud services and so on.
Hyperlight can create a new VM in 1-2 milliseconds. While this is still slower than using sandboxed runtimes like V8 or Wasmtime directly, with Hyperlight you can take those same runtimes and place them inside a VM to protect you in the event of a sandbox escape.
A one-to-two millisecond cold start for each VM is fast enough that it becomes practical to spin up VMs as needed in response to events. It also makes it possible to scale to zero, meaning that you might not need to keep idle VMs around.
Microsoft will be submitting this to the CNCF.
It sounds like Firecracker, but it is something slightly different, based on comments on Hacker News.

1:10:04 📢 Jonathan – “I think it will complement Firecracker really nicely because it’s meant for function-based workloads, not VM-based workloads. so, a millisecond startup time, just… That’s almost… It’s close enough to zero to be zero compared with 125 milliseconds for a Firecracker cold start time. And to be fair, an eighth of a second to start up a VM is amazingly impressive, but…But one to two milliseconds to fire up a virtualized function that can run is just great. Wow.”

Closing

And that is the week in the cloud! Visit our website, the home of The Cloud Pod, where you can join our newsletter or Slack team, and send feedback or ask questions at theCloudPod.net, or tweet at us with the hashtag #theCloudPod
