
Welcome to episode 290 of The Cloud Pod – where the forecast is always cloudy! It’s a full house this week – and a good thing too, since there’s a lot of news! Justin, Jonathan, Ryan, and Matthew are all in the house to bring you news on DeepSeek, OpenVox, CloudWatch, and more.
Titles we almost went with this week:
- ☁️The cloud pod wonders if azure is still hung over from new years
- 🦈Stratoshark sends the Cloud pod to the stratosphere
- 🏮Cutting-Edge Chinese “Reasoning” Model Rivals OpenAI… and it’s FREE?!
- 🧓Wireshark turns 27, Cloud Pod Hosts feel old
- ☠️Operator: DeepSeek is here to kill OpenAI
- 💸Time for a deepthink on buying all that Nvidia stock
- 🪙AWS Token Service finally goes cloud native
- 📰The CloudPod wonders if OpenAI’s Operator can order its own $200 subscription
A big thanks to this week’s sponsor:
We’re sponsorless! Want to get your brand, company, or service in front of a very enthusiastic group of cloud news seekers? You’ve come to the right place! Send us an email or hit us up on our slack channel for more info.
AI IS Going Great – Or How ML Makes All Its Money
01:29 Introducing the GenAI Platform: Simplifying AI Development for All
- If you’re struggling to find AI GPU capacity, DigitalOcean is pleased to announce that the DigitalOcean GenAI Platform is now available to everyone.
- The platform aims to democratize AI development, empowering everyone – from solo developers to large teams – to leverage the transformative potential of generative AI.
- On the GenAI Platform you can:
- Build Scalable AI Agents
- Seamlessly integrate with workflows
- Leverage guardrails
- Optimize efficiency
- Some of the use cases they are highlighting are chatbots, e-commerce assistance, support automation, business insights, AI-Driven CRMs, Personalized Learning and interactive tools.
02:23 📢 Jonathan – “Inference cost is really the big driver there. Once you build something, that’s done – but it’s nice to see somebody focusing on delivering it as a service rather than, you know, $50-an-hour compute for training models. This is right where they need to be.”
04:21 OpenAI: Introducing Operator
- We have thoughts about the name of this service…
- OpenAI is releasing the preview version of their agent that can use a web browser to perform tasks for you.
- The new version is available to OpenAI Pro users.
- OpenAI says it’s currently a research preview, meaning it has limitations and will evolve based on your feedback.
- Operator can handle various browser tasks such as filling out forms, ordering groceries, and even creating memes.
- The ability to use the same interfaces and tools that humans interact with on a daily basis broadens the utility of AI, helping people save time on everyday tasks while opening up new engagement opportunities for businesses.
- Operator is powered by a new model called Computer-Using Agent (CUA). Combining GPT-4o’s vision capabilities with advanced reasoning through reinforcement learning, CUA is trained to interact with graphical user interfaces.
- Justin was going to try it, but he forgot that the Pro plan is $200 a month – so our listeners will have to wait for his review of that one.
06:52 📢 Jonathan – “I like Operator. What I’d really like to see, though, is not having to have it open in the browser. I don’t want to watch it doing its work.”
08:09 Cutting-edge Chinese “reasoning” model rivals OpenAI o1—and it’s free to download
DeepSeek panic triggers tech stock sell-off as Chinese AI tops App Store
- There’s a lot of jokes here, but we’re going to keep it professional – you’re welcome or we’re sorry, depending on your maturity level.
- DeepSeek has turned the AI world upside down over the last week.
- Last week, Chinese AI lab DeepSeek released its new R1 model family under an open MIT License, with its largest version containing 671 billion parameters.
- The company is claiming that the model performs at the levels comparable to OpenAI’s o1 simulated reasoning model on several math and coding benchmarks.
- In addition to the main DeepSeek-R1-Zero and DeepSeek-R1 models, they released six smaller distilled versions ranging from 1.5 billion to 70 billion parameters.
- These distilled models are based on existing open source architectures like Qwen and Llama, trained using data generated from the full R1 model. The smallest version can run on a laptop, while the full model requires far more substantial computing resources.
- This stunned the AI market, as most open-weight models – which can often be run and fine-tuned on local hardware – have lagged behind proprietary models like OpenAI o1 in so-called reasoning benchmarks.
- Having these capabilities available in an MIT-licensed model that anyone can study, modify, or use commercially potentially marks a shift in what’s possible with a public model.
- The stock market panicked in response, with companies like Nvidia down 17% on Monday this week – driven by DeepSeek jumping to the top of the App Store’s free downloads, and the fact that it’s low-cost and freely available.
- The three things that have investors and researchers shocked:
- The Chinese startup trained the model for only $6 million (reportedly 3% of the cost of training OpenAI o1) as a so-called “side project,” while using less powerful NVIDIA H800 AI acceleration chips due to US export restrictions on cutting-edge GPUs.
- It appeared just four months after OpenAI announced o1 in September 2024.
- The models were released under an MIT license.
- This led investors to conclude that American tech companies – which have thrived on proprietary, closed models – have “no moat,” meaning that a technological lead built on cutting-edge hardware or an impressive bankroll doesn’t protect them from startup challengers.
- The question is whether it’s really any good, and whether they can scale to maintain this with limited access to future GPUs.
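The distillation step described above – a small student model trained to imitate outputs generated by the big R1 teacher – can be sketched in plain Python. This is a toy illustration of the classic logit-matching flavor of knowledge distillation, not DeepSeek’s actual pipeline (their distills were reportedly fine-tuned on R1-generated text samples); the “models” here are just lists of scores:

```python
import math

def softmax(logits, temperature=1.0):
    """Turn raw scores into a probability distribution."""
    exps = [math.exp(l / temperature) for l in logits]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(teacher_logits, student_logits, temperature=2.0):
    """KL divergence between the teacher's softened distribution and the
    student's - the classic knowledge-distillation objective. The higher
    temperature exposes the teacher's "soft" preferences, which the small
    model learns to imitate."""
    p = softmax(teacher_logits, temperature)   # teacher targets
    q = softmax(student_logits, temperature)   # student predictions
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))

# A student that already matches the teacher has ~zero loss;
# a student with reversed preferences has a larger loss.
teacher = [2.0, 1.0, 0.1]
print(distillation_loss(teacher, teacher))          # ~0.0
print(distillation_loss(teacher, [0.1, 1.0, 2.0]))  # larger
```

The point of the technique is exactly what made the R1 distills interesting: the expensive 671B-parameter teacher only has to run once to generate training targets, and the resulting student is cheap enough to run on a laptop.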
10:57 📢 Ryan – “The impact this story has had this week has been a roller coaster – and I don’t know if that’s just because I’ve been busy and sort of half paying attention. It wasn’t really until we were preparing for the show that I really dove in to figure out what this was. First it was a Chinese app taking over phones – I thought it was security concerns, especially with all the TikTok stuff that’s going on. Then to find out it was an AI model – there are other Chinese AI models – then the impact on Nvidia stock. It was kind of crazy to see all of this happen, and it really just proves that the AI market right now is very volatile and very subject to change.”
Cloud Tools
20:19 Enabling fast, safe migration to HCP Terraform with Terraform migrate (tf-migrate)
- Migrating to HCP Terraform can be a bit of a pain, especially when it comes to handling your state file transitions.
- When you need to migrate from Terraform Community Edition to HCP Terraform or Terraform Enterprise, state file management is the biggest challenge of that migration.
- This led HashiCorp to build tf-migrate, a utility for automating state migrations to HCP Terraform and Terraform Enterprise. It can also be used to simplify workspace setup, and it supports modular refactoring.
- There are future enhancements in the works:
- Integration with source code systems like GitHub, to enhance migration workflows by embedding migration configurations directly into repositories.
- Enhancing and extending the migration capabilities to support variables, modules, and private registries across multiple Terraform deployment options.
- Improved handling of sensitive data during migrations, such as secrets or access tokens.
- Further integration with Terraform Enterprise and Terraform Cloud to enhance governance by offering centralized control over migration tasks, audit trails, and policy enforcement.
21:44 📢 Ryan – “Anytime you have state conflict – due to either data recovery, or just trying to reconcile manual actions that have happened since, or anything like that – it’s always so painful. So I’m really happy to see tools like this exist. It’s just another example of HashiCorp building in really usable functionality, whether it’s upgrading your code to the newest Terraform version or migrating state files. I like this a whole lot.”
23:53 Sysdig extends Wireshark’s legacy with Stratoshark for cloud environments
- Sysdig Inc. announced the launch of Stratoshark, a new open-source tool that extends Wireshark’s granular network visibility into the cloud and provides users with a standardized approach to cloud system analysis.
- Wireshark is over 27 years old, with over 5 million daily users and over 160 million downloads, helping you analyze network traffic and troubleshoot issues.
- However, as companies have moved to the cloud, analysts have lacked a comparable open-source tool with the same visibility.
- Stratoshark fills that gap, with features that unlock deep cloud visibility to assist in analyzing and troubleshooting cloud system calls and logs, with a level of granularity and a workflow familiar to longtime Wireshark users.
- “Wireshark revolutionized network analysis by democratizing packet captures, a concept that Sysdig brought to cloud-native workloads and Falco extended to cloud runtime security,” said Gerald Combs, Stratoshark and Wireshark co-creator and Sysdig director of open-source projects. “Wireshark users live by the phrase ‘pcap or it didn’t happen,’ but until now cloud packet capture hasn’t been easy or even possible. Stratoshark helps unlock this level of visibility, equipping network professionals with a familiar tool that makes system call and log analysis as accessible and transformative for the cloud as Wireshark did for network packet analysis.”
- Stratoshark leverages Falco libraries, repositories, and plugins to unite deep cloud visibility with familiar Wireshark functionality.
- Falco is an open-source runtime security tool created by Sysdig that detects and alerts on unexpected behavior in cloud-native environments, such as Kubernetes.
29:30 📢 Ryan – “It’s a magic trick. I’ve used Wireshark to sort out issues that people were blaming on all kinds of different things. I remember sorting through a Java heap problem because of Wireshark outputs and timing differences and a whole bunch of things. It really is something I can break out – it looks like a tool from ancient times, but it really does help.”
31:02 OpenVox: The Community-Driven Fork of Puppet Has Arrived
- The open-source Puppet community has forked Puppet into OpenVox.
- This fork sprang from Puppet’s owner, Perforce, moving Puppet’s binaries and packages to private, hardened, and controlled locations.
- In addition, community contributors would have limited access to the program, and usage beyond 25 nodes would require a commercial license.
- These changes have been resisted by long-time Puppet users and contributors who started this fork.
- Initially referred to as the OpenPuppetProject, the community, now known as Vox Pupuli, has settled on OpenVox as the fork’s name.
- They intend to continue Puppet’s work while adhering to the open source principles.
- A GitHub repository has been set up, and discussions are ongoing regarding the project’s organizational structure and future direction.
- The intent is for this to be a soft fork, with the desire to maintain downstream compatibility for as long as possible. The Puppet standards steering committee will also include seats representing the whole community – including Perforce, whether they want to join or not.
- They don’t plan to follow Puppet exactly, with plans including:
- Modernizing the OpenVox codebase and ecosystem; in particular, the developers plan to support current OS and Ruby versions rather than relying on fifteen-year-old unmaintained Ruby gems.
- Recentering and focusing on community requirements. Actual usage patterns will drive development, rather than which customers have the deepest pockets.
- Democratizing platform support: instead of waiting for Puppet to support the current Ubuntu release, community members can contribute support to the project themselves.
- Maintaining an active and responsive open-source community. (i.e., YES, your pull request will finally get reviewed.)
35:12 📢 Jonathan – “I think with AI, as mature as it is and as mature as it’s getting, it’s not going to be long before you can point a set of AI agents at any product you like and say, build me this thing that does exactly the same thing as this. And by the way, work around these patents that they have. And we’ll be able to reproduce anything very cheaply, very quickly. I wouldn’t want to be in SaaS right now, or any kind of software, to be honest.”
AWS
36:44 CloudWatch provides execution plan capture for Aurora PostgreSQL
- CloudWatch Database Insights now collects the query execution plans of top SQL queries running on Aurora PostgreSQL instances and stores them over time. This feature helps you identify whether a change in the query execution plan is the cause of performance degradation or a stalled query.
- Execution plans are available exclusively in the advanced mode of CloudWatch Database Insights.
38:06 AWS Client VPN announces support for concurrent VPN connections
- AWS is announcing the general availability of concurrent VPN connections for AWS Client VPN – making your security people sad, but the people who have to do real work are going to be really happy.
- This feature allows you to securely connect to multiple Client VPN connections simultaneously, enabling access to your resources across different environments.
38:19 📢 Matthew – “And now we have to use Wireshark to figure out where all of our connections are going.”
40:01 AWS announces new edge location in the Kingdom of Saudi Arabia
- AWS is expanding its presence in the Kingdom of Saudi Arabia (KSA) with a new Amazon CloudFront edge location in Jeddah.
- The new edge location brings the full suite of benefits provided by Amazon CloudFront, a secure, highly distributed, and scalable CDN.
- When doing research we came across this gem: For the Kingdom of Saudi Arabia (KSA) location, you must use location-specific URLs to access the jurisdictional Google Cloud console, as well as some methods and commands in the gcloud CLI, the Cloud Client Libraries, and the Security Command Center API. WHAT? WHY?
42:23 Announcing general availability of AWS Managed Notifications
- AWS is announcing the GA of AWS Managed Notifications, a new feature of AWS User Notifications that enhances how customers receive and manage AWS Health notifications.
- Justin loves these, and would love everyone to send him some.
- This feature allows you to view and modify default AWS Health notifications in the Console Notifications Center, alongside your custom notifications such as CloudWatch alarms.
43:09 📢 Ryan – “I mean, they’ve been working towards this for a long while – I remember previewing something that was similar to this. The idea is that instead of blasting the email account that you associate with your AWS account, you can tune it to specific things, and you can have multiple targets depending on the alert, right? And that makes a lot more sense. But it still hasn’t really reconciled itself into something usable in a lot of ways. I don’t know how to get anyone to read them – you know, their database engine is two versions out of support and they need to update – and then also have the same list manage the outages that AWS might experience. It’s just sort of weird to configure this and deal with this, and it’s a strange problem that I don’t quite know the right solution to.”
47:42 Announcing upcoming changes to the AWS Security Token Service global endpoint
- AWS launched STS in August 2011 with a single global endpoint (https://sts.amazonaws.com), hosted in the US East Region.
- To reduce dependencies on a single region, STS launched AWS STS Regional endpoints in February 2015.
- These regional endpoints allow you to use STS in the same region as your workloads, improving performance and reliability.
- However, customers and third-party tools continue to call the STS global endpoint, and as a result, these customers don’t get the benefits of the regional endpoints. To help improve resiliency and performance, they are making changes to the STS global endpoint, with no action required for you.
- Today, all requests to the global endpoint are processed in the US East region. Starting in a few weeks, requests to the STS global endpoint will automatically be served in the same region as your deployed AWS workloads. For example, if your app calls sts.amazonaws.com from us-west-2, your call will be served locally by the STS service in us-west-2.
- This will apply to all regions that are enabled by default; requests from opt-in regions, or STS usage outside of AWS, will still be handled by us-east-1.
- CloudTrail logs for the global STS endpoint will still be delivered to the US East region.
- CloudTrail logs will have additional metadata fields including EndpointType and awsServingRegion to clarify which endpoint and region served the request.
- Requests made to the sts.amazonaws.com endpoint will have a value of us-east-1 for the requested-region condition key, regardless of which region served the request.
- Requests handled by the global STS endpoint will not share a request quota with the regional STS endpoints.
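The routing change described in the bullets above is easy to see in a sketch. This is plain Python with no AWS SDK – in real SDKs, endpoint selection is controlled by settings like the `AWS_STS_REGIONAL_ENDPOINTS` environment variable rather than hand-built URLs, and the `serving_region` logic here is just an illustrative model of the announced behavior, not AWS code:

```python
def sts_endpoint(region=None):
    """Return the STS endpoint hostname an SDK would target.

    With no region configured, calls go to the legacy global endpoint;
    with a region, they go to the regional endpoint, e.g.
    sts.us-west-2.amazonaws.com."""
    if region is None:
        return "sts.amazonaws.com"           # global (historically us-east-1)
    return f"sts.{region}.amazonaws.com"     # regional endpoint

def serving_region(endpoint, caller_region, opt_in=False):
    """Model of the announced change: global-endpoint calls from a
    default-enabled region are now served in that region, while opt-in
    regions and callers outside AWS still land in us-east-1."""
    if endpoint != "sts.amazonaws.com":
        return endpoint.split(".")[1]        # regional endpoints serve themselves
    if caller_region and not opt_in:
        return caller_region                 # new behavior: served locally
    return "us-east-1"                       # opt-in region or off-cloud caller

print(sts_endpoint("us-west-2"))                          # sts.us-west-2.amazonaws.com
print(serving_region("sts.amazonaws.com", "us-west-2"))   # us-west-2
print(serving_region("sts.amazonaws.com", None))          # us-east-1
```

Note the asymmetry this models: your code keeps calling the same hostname, but where the request is served changes underneath you – which is exactly why CloudTrail is gaining the awsServingRegion field to tell you what actually happened.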
52:09 📢 Justin – “I imagine if they retire this, it breaks all of us-east-1 forever.”
53:09 Amazon S3 Metadata is now generally available
- AWS is announcing the GA of Amazon S3 Metadata. S3 Metadata provides automated and easily queried metadata that updates in near real time, simplifying business analytics, real-time inference applications, and more. S3 Metadata supports object metadata, which includes system-defined details like the size and source of the object, and custom metadata, which allows you to use tags to annotate your objects with information like product SKU, transaction ID, or content rating.
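The kind of lookup this enables can be shown with a toy in-memory sketch. In practice the metadata lands in queryable tables you’d hit with a SQL engine; the records and field names below are made up for illustration, not S3 Metadata’s actual schema:

```python
# Toy records shaped like the announcement describes: system-defined
# fields (key, size) plus custom key-value metadata.
objects = [
    {"key": "img/001.png", "size": 52_133, "custom": {"product_sku": "A-100"}},
    {"key": "img/002.png", "size": 48_002, "custom": {"product_sku": "B-200"}},
    {"key": "logs/1.gz",   "size": 9_441,  "custom": {}},
]

def find_by_custom(objects, field, value):
    """The workaround S3 Metadata replaces: without a metadata index you'd
    LIST every object and inspect each one; with an index, this scan
    becomes a single query against a table."""
    return [o["key"] for o in objects if o["custom"].get(field) == value]

print(find_by_custom(objects, "product_sku", "A-100"))  # ['img/001.png']
```

This is the “crazy workaround” problem Ryan mentions: before, answering “which objects have SKU A-100?” meant building and maintaining your own external index of the bucket.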
53:39 📢 Ryan – “I’ve needed this for a long time, and I’ve done some crazy workarounds. I’m glad to see they’re rolling it out there, because it is super useful.”
GCP
54:28 Introducing BigQuery metastore, a unified metadata service with Apache Iceberg support
- Google is releasing the public preview of BigQuery metastore, a fully managed, unified metadata service that provides processing-engine interoperability while enabling consistent data governance.
- BigQuery metastore is a highly scalable runtime metadata service that works with multiple engines – for example, BigQuery, Apache Spark, Apache Hive, and Apache Flink – and supports the Apache Iceberg table format.
- This allows analytics engines to query one copy of the data with a single schema, whether the data is stored in BigQuery storage tables, BigQuery tables for Apache Iceberg, or BigLake External tables.
54:48 Safer automated deployments with new Cloud Deploy features
- Cloud Deploy is getting several new features this week, but all of these are in preview, so don’t rip out your current CD solutions yet.
- Repair rollouts lets you retry failed deployments or automatically roll back to a previously successful release when an error occurs.
- The error can come at any phase of the deployment – from a SQL migration, a misconfiguration detected when talking to a GKE cluster, or a deployment verification step.
- Deploy policies limit what automation or users can do. Initially, they’re launching a time-windows policy, which can, for example, inhibit deployments during evenings, weekends, or important events. While an on-call engineer with the policy overrider role could “break glass” to get around the policies, automated deployments won’t be able to trigger during the middle of a big demo.
- Timed promotions: after a release is successfully rolled out, you may want to automatically deploy it to the next environment. Previous auto-promote features let you promote a release after a specified duration – for example, moving it into prod 12 hours after it went to staging. But often you want promotions to happen on a schedule, not based on a delay.
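The time-windows policy described above boils down to a simple check. Here’s a minimal sketch of that logic in plain Python – the weekend/business-hours windows and the break-glass override flag are illustrative assumptions, not Cloud Deploy’s actual API or policy schema:

```python
from datetime import datetime

# Illustrative policy: block rollouts on weekends and outside business
# hours, unless the caller holds a break-glass override role.
BLOCKED_WEEKDAYS = {5, 6}          # Saturday, Sunday (Monday == 0)
BUSINESS_HOURS = range(9, 18)      # 09:00-17:59

def deployment_allowed(when: datetime, override: bool = False) -> bool:
    """Return True if a rollout may proceed at `when`."""
    if override:                   # on-call "break glass"
        return True
    if when.weekday() in BLOCKED_WEEKDAYS:
        return False
    return when.hour in BUSINESS_HOURS

print(deployment_allowed(datetime(2025, 1, 31, 14, 0)))                 # Friday 2pm: True
print(deployment_allowed(datetime(2025, 2, 1, 14, 0)))                  # Saturday: False
print(deployment_allowed(datetime(2025, 2, 1, 14, 0), override=True))   # break glass: True
```

The design point is who gets each path: a human on-call can pass the override, while automated promotions always take the unprivileged path – which is exactly how the policy keeps a scheduled promotion from firing in the middle of a big demo.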
56:56 📢 Matthew – “I miss a good code deploy cloud deploy tool. That’s all I have to say here.”
59:53 Introducing agent evaluation in Vertex AI Gen AI evaluation service
- Google is announcing agent evaluation in the Vertex AI Gen AI evaluation service, in preview. This new feature empowers developers to rigorously assess and understand their AI agents. It includes a powerful set of evaluation metrics specifically designed for agents built with different frameworks, and provides native agent inference capabilities to streamline the evaluation process.
1:00:58 📢 Justin – “I don’t know how it works, I just know that’s what they’re doing.”
1:02:18 Announcing smaller machine types for A3 High VMs
- You can now get A3 High VMs, powered by NVIDIA H100 80GB GPUs, in multiple machine types – including 1-, 2-, 4-, and 8-GPU options.
- There’s also support for Spot pricing, as well as integration with Vertex AI.
Off Topic, But Interesting…
1:04:38 New Year, New OS. Supporting your business with ChromeOS Flex
- If you have some old laptops or computers hanging around, there’s now a no-cost, easy-to-deploy solution to breathe new life into them.
- With just a USB stick, you can install ChromeOS Flex and transform aging laptops, kiosks and more into fast, secure and modern devices.
- Google says it’s the perfect solution for businesses hoping to refresh devices, improve security, and embrace sustainability.
- Going into 2025, they’ve certified over 600 devices to work effortlessly with ChromeOS Flex.
1:06:15 📢 Jonathan- “I like the idea of what they’re doing. I think if it saves a bunch of stuff going in a landfill or something and brings some new life into things for a few more years, that’s great. Especially as Windows 11 is only supporting newer CPUs and TPMv2 and things like that. It’s super annoying that the OS vendor would do that.”
Closing
And that is the week in the cloud! Visit our website, the home of The Cloud Pod, where you can join our newsletter and Slack team, send feedback, or ask questions at theCloudPod.net – or tweet at us with the hashtag #theCloudPod