
Welcome to episode 291 of The Cloud Pod – where the forecast is always cloudy! Justin, Jonathan, and Ryan have battled through the various plagues and have come together to bring you all the latest in cloud news, including kro, DeepSeek, and Copilot.
Titles we almost went with this week:
- 🏮In Shocking News China Steals US IP
- 🏛️The Cloud Pod is Now Supported in Gov Cloud
- 🔥Microsoft Goes Open Source No SQL… and Hell Hasn’t Frozen Over
- 🧟Zombie Buckets Receive How Much Traffic?!?
- 🍽️AWS, GCP and Azure eat KRO
- 🧑‍✈️GitHub Copilot for Free, so You Can Win at Coding Interviews
- 💭Customized Best Practices… I don’t think you know what best practices are
- ☁️TheCloudPod Leverages Deep Understanding to Make a Nuanced Decision on adopting Copilot
A big thanks to this week’s sponsor:
We’re sponsorless! Want to get your brand, company, or service in front of a very enthusiastic group of cloud news seekers? You’ve come to the right place! Send us an email or hit us up on our slack channel for more info.
Follow Up
01:23 Is DeepSeek really sending data to China? Let’s decode
- One of the early concerns about DeepSeek was its privacy implications, starting with their privacy policy.
- The allegations are significant, but the reality is that if the open-source model is hosted locally or orchestrated on GPUs in the US, the data does not go to China.
- But if you’re using the DeepSeek app, the privacy policy clearly states that your data will be stored in China, and data hosted on Chinese servers can be seized by the government at any time.
- Maybe rethink using the native DeepSeek websites and mobile apps, and just host the model locally in LM Studio instead; a minimal sketch of talking to a local model follows.
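For the curious: LM Studio exposes an OpenAI-compatible server on localhost (port 1234 by default), so pointing the standard openai Python client at it is all it takes. A minimal sketch, where the model name is an assumption (use whatever you’ve loaded in LM Studio):

```python
# A minimal sketch of chatting with a locally hosted DeepSeek model via
# LM Studio's OpenAI-compatible local server (http://localhost:1234/v1
# by default). Nothing leaves your machine.
from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:1234/v1",  # LM Studio's default local endpoint
    api_key="lm-studio",  # LM Studio ignores the key, but the SDK requires one
)

response = client.chat.completions.create(
    model="deepseek-r1-distill-qwen-7b",  # hypothetical: use your loaded model's name
    messages=[{"role": "user", "content": "Summarize this week's cloud news."}],
)
print(response.choices[0].message.content)
```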
02:21 📢 Jonathan – “They’re collecting some weird data. I get collecting conversational data, because that is the business they’re in, but they’re also doing some weird stuff, like they fingerprint users by looking at the patterns of the way that they type. Not just what they type, but how they type, like the timing between hitting different letters – things like that.”
08:06 OpenAI Believes DeepSeek Was Developed Using OpenAI Models
- Listener Note: paywall article
- OpenAI says they have found evidence that the Chinese firm behind DeepSeek developed the AI using information generated by OpenAI’s models.
- This is prohibited by the OpenAI terms of service, and is a practice known as AI model distillation.
- With distillation, the developer asks existing AI models lots of questions and uses the answers to develop new models that mimic their performance.
- This shortcut results in models that roughly approximate state-of-the-art models but don’t cost much to produce. (A toy sketch of the distillation loop follows this list.)
- OpenAI said last year it would stop selling access to its models directly to customers based in China, while Microsoft has continued to resell OpenAI models through its Azure cloud service to Chinese customers.
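For illustration only, here is a toy sketch of what distillation-style data collection looks like: hammer a teacher model with prompts and save the answers as supervised fine-tuning data for a student model. The prompts and filename are made up, and real pipelines run millions of queries; doing this against OpenAI’s API is exactly what their terms prohibit.

```python
# A toy sketch of the "distillation" loop described above: query a teacher
# model, collect the answers, and save them as fine-tuning data for a
# student model. (This is the practice OpenAI's terms of service forbid.)
import json
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

prompts = [
    "Explain quantum entanglement simply.",
    "Write a binary search in Python.",
]

with open("distillation_data.jsonl", "w") as f:
    for prompt in prompts:
        answer = client.chat.completions.create(
            model="gpt-4o",
            messages=[{"role": "user", "content": prompt}],
        ).choices[0].message.content
        # Each teacher answer becomes a training example for the student.
        f.write(json.dumps({"prompt": prompt, "completion": answer}) + "\n")
```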
09:15 📢 Justin- “Oh, you mean the company that stole all the internet data in the world to create a model is complaining about another company stealing their data?”
General News
11:42 Abandoned AWS S3 buckets can be reused in supply-chain attacks that would make SolarWinds look ‘insignificant’
8 Million Requests Later, We Made The SolarWinds Supply Chain Attack Look Amateur
- watchTowr Labs security researchers are claiming that abandoned AWS S3 buckets could be reused to hijack the global software supply chain in an attack that would make “SolarWinds look amateurish and insignificant.”
- The researchers report that they have identified 150 buckets that were long gone, yet applications and websites are still trying to pull software updates and other code from them.
- If someone were to take over those buckets, they could be used to feed malicious software updates into people’s devices.
- The buckets were previously owned by governments, Fortune 500 firms, technology and cybersecurity companies, and major open source projects.
- The watchTowr team spent less than $500 to re-register 150 S3 buckets with the same names, and enabled logging to determine what files were still being requested, and by what. (A rough sketch of the setup follows this list.)
- Then, they spent 2 months watching the requests.
- During those 2 months, the S3 buckets received more than eight million requests for resources including Windows, Linux, and macOS executables, virtual machine images, JavaScript files, CloudFormation templates, and SSL VPN server configurations.
- Requests came from all over, including NASA and US government networks, along with government organizations in the UK and other countries.
- watchTowr CEO Benjamin Harris said that it would be terrifyingly simple to pull off an exploit in this way.
- BTW, Justin super approves of this company, as they use a lot of memes in their article. 🙂
- AWS took the S3 buckets off watchTowr’s hands and sinkholed them, so these 150 are no longer exploitable… but how many more exist out there?
- They didn’t really break down how they found them, but it’s probably not very hard to do.
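This is not watchTowr’s actual tooling, but a rough boto3 sketch of the experiment: claim the abandoned bucket name, then turn on server access logging to see who’s still phoning home. Bucket names are hypothetical.

```python
# A rough sketch of the watchTowr-style experiment: re-create an abandoned
# bucket name, then enable server access logging to see what is still
# requesting objects from it.
import boto3

s3 = boto3.client("s3", region_name="us-east-1")

# S3 bucket names are global: once the original owner deletes a bucket,
# anyone can claim the same name.
s3.create_bucket(Bucket="some-abandoned-update-bucket")

# Send access logs to a second bucket you control (it needs a bucket policy
# that allows logging.s3.amazonaws.com to write to it).
s3.put_bucket_logging(
    Bucket="some-abandoned-update-bucket",
    BucketLoggingStatus={
        "LoggingEnabled": {
            "TargetBucket": "my-log-sink-bucket",
            "TargetPrefix": "zombie-requests/",
        }
    },
)
```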
13:55 📢 Jonathan – “It’s no different than domain registrations expiring, or getting somebody’s phone number after it’s been advertised…I feel like they’re pointing the finger at Amazon a little more than they should. To say that it’s a supply chain attack is kind of a stretch because these companies don’t exist anymore, that’s why the buckets are gone – so it’s a dead supply chain attack.”
AI is Going Great – or How ML Makes All Its Money
20:19 Introducing ChatGPT Gov
- OpenAI is releasing a version of ChatGPT targeted at the public sector. They believe the US government’s adoption of AI can boost efficiency and productivity, and is crucial for maintaining and enhancing America’s global leadership.
- By making the products available to the US government, they aim to ensure AI serves the national interest and the public good, aligned with democratic values, while empowering policymakers to responsibly integrate capabilities to deliver better services to the American people. (Side note, did anyone else lol at this?)
- ChatGPT Gov is a new tailored version of ChatGPT designed to provide US government agencies with an additional way to access OpenAI’s frontier models.
- Agencies can deploy ChatGPT Gov in their own MS Azure commercial cloud or Azure Government cloud on top of the Microsoft Azure OpenAI service.
- Self-hosting ChatGPT Gov enables agencies to more easily manage their own security, privacy, and compliance requirements, such as stringent cybersecurity frameworks (IL5, CJIS, ITAR, and FedRAMP High).
- Additionally, they believe the infrastructure will expedite internal authorization of OpenAI’s tools for the handling of non-public sensitive data.
- ChatGPT Gov reflects their commitment to helping US government agencies leverage OpenAI’s technology today, while they continue to work toward FedRAMP Moderate and High accreditations for their SaaS product, ChatGPT Enterprise. They are also evaluating expanding ChatGPT Gov to Azure’s classified regions.
22:13 📢 Justin – “Remember back in the early days of Cloud Pod when we were talking about all the engineers protesting at the companies about the machine learning being used on video content for police forces, and I was thinking about that compared to this…I don’t know if people are going to protest this. They should. They probably should.”
23:23 OpenAI Revenue Surged From $200-a-Month ChatGPT Subscriptions
- Reportedly, the $200-a-month ChatGPT Pro subscriptions have raised OpenAI’s revenue by $25M a month, or at least $300M on an annual basis.
- I guess we don’t know what we are talking about… I’m still unclear what they’re buying with this other than the Vision capability they just launched.
- Interested in checking out the pricing models for yourself? You can do that – here!
25:04 📢 Ryan – “I do love that the rabbit holes that I fall into for internet research have now been outsourced to AI, so I can just have the robot do the rabbit hole.”
27:32 Introducing deep research
- OpenAI has released deep research in ChatGPT, a new agentic capability that conducts multi-step research on the internet for complex tasks. It accomplishes in tens of minutes what would take a human many hours.
- When prompted, deep research will find, analyze, and synthesize hundreds of online sources to create a comprehensive report at the level of a research analyst.
- Leveraging an OpenAI o3 model optimized for web browsing and data analysis, it uses reasoning to search, interpret, and analyze massive amounts of text, images, and PDFs on the internet, pivoting as needed in reaction to the information it encounters.
- Deep research was built for fields like finance, science, policy, and engineering, where thorough, precise, and reliable research is needed.
- To use it, select “deep research” in the message composer and enter your query. Tell ChatGPT what you need, whether it’s a competitive analysis on streaming platforms or a personalized report on the best commuter bike. You can attach files and spreadsheets to add context to your question. Once it starts running, a sidebar appears with a summary of the steps taken and sources used.
- Deep research may take anywhere from 5 to 30 minutes to complete its work, taking the time needed to dive deep into the web.
30:05 Announcing DeepSeek-R1 in Preview on Snowflake Cortex AI
- All the cloud providers are starting to offer DeepSeek, with the first up this week being Snowflake Cortex AI.
- The model is available in private preview for serverless inference for batch and interactive.
- The model is hosted in the US with no data shared with the model provider.
- Once GA, you’ll be able to manage access to the model via role-based access control (RBAC); a sketch of the pattern follows.
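As a rough idea of what that looks like, here’s Snowflake’s documented Cortex RBAC pattern driven from snowflake-connector-python. The connection details are hypothetical, and the exact grants for the DeepSeek-R1 preview may differ, so treat this as illustrative.

```python
# A minimal sketch of gating Cortex model access with Snowflake RBAC. The
# SNOWFLAKE.CORTEX_USER database role is Snowflake's documented gate for
# Cortex AI functions; the model-specific grants for DeepSeek-R1 may differ.
import snowflake.connector

conn = snowflake.connector.connect(
    account="my_account",  # hypothetical connection details
    user="admin_user",
    password="...",
    role="ACCOUNTADMIN",
)
cur = conn.cursor()
# Revoke the default grant to PUBLIC, then grant Cortex access to one role.
cur.execute("REVOKE DATABASE ROLE SNOWFLAKE.CORTEX_USER FROM ROLE PUBLIC")
cur.execute("GRANT DATABASE ROLE SNOWFLAKE.CORTEX_USER TO ROLE ML_TEAM")
conn.close()
```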
30:31 📢 Justin – “So if you want to try Deep Seek in a safer environment, Snowflake is your friend.”
Cloud Tools
31:02 Introducing Qonto’s Prometheus RDS Exporter – An Open Source Solution to Enhance Monitoring Amazon RDS
- Databases are a critical part of your infrastructure, and if you’re using AWS RDS, the ability to get metrics like CPU, RAM, IOPS, storage, or service quotas is critical, but challenging as the number of RDS instances grows into the tens, hundreds, or thousands of databases to monitor.
- This is why a standardized approach to database monitoring can help administrators save time and help scale their business with lower risk.
- Qonto, a leading payment institution that offers a range of banking services to small businesses, has published a unified framework for Amazon RDS monitoring, which helps them deploy best practices at scale and monitor hundreds of databases with limited effort.
- This automation comes as the Prometheus RDS Exporter for Amazon RDS monitoring, and they have open sourced it under an MIT license.
- Qonto wanted to aggregate key RDS metrics and push them into Prometheus for monitoring and alerting purposes; a toy sketch of that exporter pattern follows.
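Qonto’s exporter itself is a packaged Go binary, but the pattern is easy to picture. A toy Python sketch of the same idea, assuming boto3 and prometheus_client: pull an RDS metric from CloudWatch and expose it on a /metrics endpoint for Prometheus to scrape. The instance name and port are hypothetical.

```python
# A toy sketch of the exporter pattern: pull RDS metrics from CloudWatch
# and expose them as Prometheus gauges on a /metrics endpoint.
import time
from datetime import datetime, timedelta, timezone

import boto3
from prometheus_client import Gauge, start_http_server

cloudwatch = boto3.client("cloudwatch")
cpu_gauge = Gauge("rds_cpu_utilization_percent", "RDS CPU utilization", ["instance"])

def scrape(instance_id: str) -> None:
    now = datetime.now(timezone.utc)
    stats = cloudwatch.get_metric_statistics(
        Namespace="AWS/RDS",
        MetricName="CPUUtilization",
        Dimensions=[{"Name": "DBInstanceIdentifier", "Value": instance_id}],
        StartTime=now - timedelta(minutes=5),
        EndTime=now,
        Period=300,
        Statistics=["Average"],
    )
    if stats["Datapoints"]:
        latest = max(stats["Datapoints"], key=lambda d: d["Timestamp"])
        cpu_gauge.labels(instance=instance_id).set(latest["Average"])

if __name__ == "__main__":
    start_http_server(9043)  # arbitrary port for Prometheus to scrape
    while True:
        scrape("orders-db")  # hypothetical RDS instance identifier
        time.sleep(60)
```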
32:01 📢 Ryan – “I do like the sort of standardization that Prometheus has brought. I get a little frustrated sometimes with some of the use cases, because it’s a big, big hammer that can be set up to solve little problems. But something like this, if you’ve got enough scale, where you’re struggling to visualize and see metrics across hundreds of Amazon accounts, and then maybe you’ve got other applications that are using OpenTelemetry – I think this is pretty cool that you can standardize it and put it all in one place.”
AWS
35:38 Amazon Redshift announces enhanced default security configurations for new warehouses
- Amazon Redshift announces enhanced security defaults to help you adhere to best practices in data security and reduce the risk of potential misconfigurations.
- These changes include disabling public access, enabling database encryption, and enforcing secure connections by default when creating a new data warehouse. (A sketch of the equivalent explicit settings follows below.)
- AMEN.
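For reference, here’s roughly what those defaults amount to if you were to set them explicitly with boto3. Identifiers and credentials here are hypothetical.

```python
# A sketch of Redshift's new secure defaults, set explicitly: no public
# access, encryption at rest, and SSL required via a parameter group.
import boto3

redshift = boto3.client("redshift")

# Parameter group that rejects non-SSL connections.
redshift.create_cluster_parameter_group(
    ParameterGroupName="require-ssl",
    ParameterGroupFamily="redshift-1.0",
    Description="Enforce TLS connections",
)
redshift.modify_cluster_parameter_group(
    ParameterGroupName="require-ssl",
    Parameters=[{"ParameterName": "require_ssl", "ParameterValue": "true"}],
)

redshift.create_cluster(
    ClusterIdentifier="analytics-wh",   # hypothetical identifiers
    NodeType="ra3.xlplus",
    NumberOfNodes=2,
    MasterUsername="admin",
    MasterUserPassword="...",
    PubliclyAccessible=False,  # now the default
    Encrypted=True,            # now the default
    ClusterParameterGroupName="require-ssl",
)
```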
39:18 DeepSeek-R1 models now available on AWS
- Amazon is also providing you access to DeepSeek-R1 models in Amazon Bedrock and Amazon SageMaker AI.
- As this is a publicly available model, you only pay for the infrastructure, priced based on the inference instance hours you select for Bedrock, SageMaker JumpStart, and EC2. (A minimal invocation sketch follows.)
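A minimal sketch of invoking the model through the Bedrock runtime once you’ve deployed it. Since this is a marketplace-style, endpoint-based deployment, the modelId ARN below is hypothetical (use the one from your own deployment).

```python
# A minimal sketch of calling DeepSeek-R1 via the Bedrock Converse API.
import boto3

bedrock = boto3.client("bedrock-runtime", region_name="us-west-2")

response = bedrock.converse(
    # Hypothetical endpoint ARN; use the ARN from your own deployment.
    modelId="arn:aws:sagemaker:us-west-2:123456789012:endpoint/deepseek-r1-demo",
    messages=[{"role": "user", "content": [{"text": "Why is the sky blue?"}]}],
    inferenceConfig={"maxTokens": 512, "temperature": 0.6},
)
print(response["output"]["message"]["content"][0]["text"])
```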
40:06 Amazon EC2 now supports automated recovery of Microsoft SQL Server with VSS
- In horrible ideas, you can now automate recovery of MSSQL Server databases from VSS-based EBS snapshots. Customers can use an AWS Systems Manager runbook and specify a restore point to automate recovery without stopping a running MSSQL database.
- VSS allows application data to be backed up while applications are running. This new feature enables customers to automate recovery from VSS-based EBS snapshots and ensures rapid recovery of large databases within minutes. (A rough sketch of kicking this off follows.)
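Something like the following is how you’d kick it off. start_automation_execution is the real API, but the runbook name and parameter keys here are hypothetical, so check the AWS announcement for the actual document.

```python
# A rough sketch of starting the VSS-based restore via Systems Manager
# Automation. Document name and parameter keys are hypothetical.
import boto3

ssm = boto3.client("ssm")

execution = ssm.start_automation_execution(
    DocumentName="AWSEC2-RestoreSqlServerDatabaseWithVss",  # hypothetical name
    Parameters={
        "InstanceId": ["i-0123456789abcdef0"],
        "RestorePoint": ["2025-02-01T04:00:00Z"],  # hypothetical parameter
        "DatabaseName": ["OrdersDB"],              # hypothetical parameter
    },
)
print(execution["AutomationExecutionId"])
```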
40:38 📢 Justin – “Just use SQL backup natively please.”
GCP
04:38 Introducing custom rules in Workload Manager: Evaluate workloads against customized best practices
- Workload Manager provides a rule-based validation service for evaluating your workloads on Google Cloud.
- Workload Manager scans your workloads, including SAP and MSSQL, to detect deviations from standards, rules, and best practices, improving system quality, reliability, and performance.
- Now you can extend Workload Manager with custom rules (GA), a detective control that doesn’t block any deployments, but lets you easily detect compliance issues across different architectural intents.
- Rules can be evaluated against projects, folders, and organizations, checking both best practices and your own custom standards.
- To get started, you codify best practices in Rego, a declarative policy language that’s used to define rules and express policies over complex data structures, and run or schedule evaluation scans across your deployments.
- Then you export the findings to a BigQuery dataset and visualize them using Looker. (A minimal sketch of querying the exported findings follows this list.)
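Once the findings land in BigQuery, reporting on them is a short query away. A minimal sketch, where the project, dataset, and column names are hypothetical (match them to the schema Workload Manager exports for you):

```python
# A minimal sketch of pulling exported Workload Manager findings back out
# of BigQuery for reporting. Dataset and column names are hypothetical.
from google.cloud import bigquery

client = bigquery.Client(project="my-project")  # hypothetical project

query = """
    SELECT rule_name, resource, COUNT(*) AS violations
    FROM `my-project.workload_manager.evaluation_findings`
    GROUP BY rule_name, resource
    ORDER BY violations DESC
    LIMIT 20
"""
for row in client.query(query).result():
    print(f"{row.rule_name}: {row.violations} violation(s) on {row.resource}")
```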
43:44 📢 Ryan – “I mean, I do like these types of workflows, and the reason I like them is so you can practice security without everything being in force mode. And if you’re allowing direct access to clouds, then you are allowing the users in the company to not have to go through a centralized team, or an infrastructure team…and you’re going to end up with insecure configurations, because random people are clicking through defaults.”
45:22 Blackwell is here — new A4 VMs powered by NVIDIA B200 now in preview
- Google is bringing the NVIDIA Blackwell GPU to Google Cloud with the preview of A4 VMs, powered by NVIDIA HGX B200.
- The A4 VM features eight Blackwell GPUs interconnected by fifth-generation NVIDIA NVLink, and offers a significant performance boost over the previous-generation A3 High VMs.
- Each GPU delivers 2.25 times the peak compute and 2.25 times the HBM capacity of the previous generation, making A4 VMs a versatile option for training and fine-tuning a wide range of model architectures.
- The A4 VM integrates Google’s infrastructure with Blackwell GPUs to bring the best cloud experience for Google Cloud customers, from scale and performance to ease of use and cost optimization:
- Enhanced networking with the Titanium ML network adapter, optimized to deliver a secure, high-performance cloud experience for AI workloads, building on NVIDIA ConnectX-7 NICs.
- Google Kubernetes Engine support for up to 65,000 nodes per cluster; A4 VMs are natively integrated into GKE.
- Vertex AI will support the A4 VMs.
- Optimized PyTorch and CUDA support, plus close work with NVIDIA to optimize JAX and XLA.
- Hypercompute Cluster, with tight GKE and Slurm integration.
- “We’re excited to leverage A4, powered by NVIDIA’s Blackwell B200 GPUs. Running our workload on cutting edge AI Infrastructure is essential for enabling low-latency trading decisions and enhancing our models across markets. We’re looking forward to leveraging the innovations in Hypercompute Cluster to accelerate deployment of training our latest models that deliver quant-based algorithmic trading.” – Gerard Bernabeu Altayo, Compute Lead, Hudson River Trading
47:37 📢 Jonathan – “Yeah, the NVLink is really quite the performance booster here because consumer cards use PCIe, which is very low bandwidth, relatively speaking. So I think that the real advantage in using these clusters that they put together is just because of the massive bandwidth between nodes in the cluster. And the real bottleneck in clustering GPUs is communication between nodes, which is why DeepSeek did some cool stuff with what they were doing in building their model. What they did is they, instead of using CUDA, they used a low-level language, PTX, and they reassigned some of the cores to compress data and to work on optimizing network traffic between nodes, and that’s probably one of the reasons they were able to do what they did with such kind of strange resources.”
49:55 Simplify the developer experience on Kubernetes with KRO
- Hell has NOT frozen over. (As far as we know.)
- Google, AWS, and Azure have been collaborating on Kube Resource Orchestrator (kro).
- kro introduces a Kubernetes-native, cloud-agnostic way to define groupings of Kubernetes resources.
- With kro, you can group your applications and their dependencies as a single resource that can be easily consumed by end users.
- Before kro, you had to invest in custom solutions such as building custom Kubernetes controllers, or use packaging tools like Helm, which can’t leverage the benefits of Kubernetes CRDs.
- These approaches are costly to create, maintain, and troubleshoot, and complex for non-Kubernetes experts to consume. This is a problem many Kubernetes users face. Rather than developing vendor-specific solutions, Google has partnered with Amazon and Microsoft to make Kubernetes APIs simpler for all Kubernetes users.
- Platform and DevOps teams want to define standards for how application teams deploy their workloads, and they want to use Kubernetes as the platform for creating and enforcing these standards. Each service needs to handle everything from resource creation to security configurations, monitoring setup, defining the end-user interface, and more. There are client-side templating tools that can help, like Helm or Kustomize, but Kubernetes lacked a native way for platform teams to create custom groupings of resources for consumption by end users.
- kro is a Kubernetes-native framework that lets you create a reusable API to deploy multiple resources as a single unit. You can use it to encapsulate a Kubernetes deployment and its dependencies into a single API that your application teams can use, even if they aren’t familiar with Kubernetes. (A rough sketch follows this list.)
- You can use kro to create custom end-user interfaces that expose only the parameters an end user should see, hiding the complexity of Kubernetes and cloud-provider APIs.
- See the article for some example use cases.
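To make that concrete, here’s a rough sketch of registering a kro API using the Kubernetes Python client, with the manifest modeled on the project’s published examples. Treat the schema as approximate: early releases called the CRD ResourceGroup before it became ResourceGraphDefinition, so check kro.run for what your version expects.

```python
# A rough sketch of a kro ResourceGraphDefinition: a simple "WebApp" API
# for app teams, backed by a real Deployment that kro creates for them.
from kubernetes import client, config

config.load_kube_config()

web_app_api = {
    "apiVersion": "kro.run/v1alpha1",
    "kind": "ResourceGraphDefinition",
    "metadata": {"name": "webapp"},
    "spec": {
        # The simple interface your app teams will see...
        "schema": {
            "apiVersion": "v1alpha1",
            "kind": "WebApp",
            "spec": {"name": "string", "image": 'string | default="nginx"'},
        },
        # ...and the real resources kro creates behind it.
        "resources": [
            {
                "id": "deployment",
                "template": {
                    "apiVersion": "apps/v1",
                    "kind": "Deployment",
                    "metadata": {"name": "${schema.spec.name}"},
                    "spec": {
                        "replicas": 1,
                        "selector": {"matchLabels": {"app": "${schema.spec.name}"}},
                        "template": {
                            "metadata": {"labels": {"app": "${schema.spec.name}"}},
                            "spec": {
                                "containers": [
                                    {"name": "web", "image": "${schema.spec.image}"}
                                ]
                            },
                        },
                    },
                },
            }
        ],
    },
}

client.CustomObjectsApi().create_cluster_custom_object(
    group="kro.run",
    version="v1alpha1",
    plural="resourcegraphdefinitions",
    body=web_app_api,
)
```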
52:59 📢 Ryan – “I can see this being easier to support within a business. But it still has all the problems that I don’t like about operators and custom resources, trying to make this the one the API for everything – on a very complex system.”
54:20 Announcing the general availability of Spanner Graph
- Spanner Graph is now Generally Available.
- Graph analysis helps reveal hidden connections in data and when combined with techniques like full-text search and vector search, enables you to deliver a new class of AI-enabled application experiences.
- The traditional approaches based on niche tools resulted in data silos, operational overhead and scalability challenges.
- It really is the tool looking for a solution.
55:58 AlloyDB Omni K8s Operator 1.3 GA
- This new operator has several nice features:
- Version 1.3.0 of the operator supports connection pooling.
- You can put databases in maintenance mode.
- You can create replication slots and users for logical replication via the operator API.
- This release of the K8s operator adds support for kube-state-metrics, so you can use Prometheus or a Prometheus-compatible scraper to consume and display custom metrics.
- When you create a new database cluster, this version of the operator creates the read-only and read-write load balancers concurrently, which reduces the time it takes for the database cluster to be ready.
- Configurable log rotation, with a default retention of seven days; each archived file is individually compressed using gzip.
- Various bug fixes and performance improvements.
56:54 📢 Justin – “This is nice, if you’re using Omni, and you want to do Kubernetes things.”
Azure
58:15 DocumentDB: Open-Source Announcement
- Microsoft is announcing the official release of DocumentDB — an open-source document database platform and the engine powering the vCore-based Azure Cosmos DB for MongoDB, built on PostgreSQL.
- The project uses the permissive MIT License.
- There are two components to the project:
- pg_documentdb_core – a custom PostgreSQL extension optimized for BSON data type support in Postgres.
- pg_documentdb_api – the data plane implementing CRUD operations, query functionality, and index management.
58:50 📢 Jonathan – “Why would they call it the same name as Amazon’s DB?”
59:50 Announcing a free GitHub Copilot for Visual Studio
- Microsoft has released a free plan for GitHub Copilot, available for everyone using Visual Studio.
- With the free version you get:
- 2,000 code completions per month
- 50 chat messages per month
- Access to the latest AI models, with Anthropic’s Claude 3.5 Sonnet and OpenAI’s GPT-4o.
- Thanks for not charging us twice, we guess?
1:02:15 Announcing the availability of the o3-mini reasoning model in Microsoft Azure OpenAI Service
- We are pleased to announce that OpenAI o3-mini is now available in Microsoft Azure OpenAI Service. o3-mini adds significant cost efficiencies compared to o1-mini, with enhanced reasoning and new features like reasoning effort and tools, while providing comparable or better responsiveness.
- New features of o3-mini:
- Reasoning effort parameter (a quick sketch follows this list)
- Structured outputs
- Function and tools support
- Developer messages
- System message compatibility
- Continued strength in coding, math, and scientific reasoning.
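A quick sketch of the headline feature, the reasoning-effort knob, via the openai SDK’s AzureOpenAI client. The endpoint, API version, and deployment name are hypothetical; match them to your own Azure OpenAI resource.

```python
# A minimal sketch of o3-mini's reasoning_effort parameter on Azure OpenAI.
from openai import AzureOpenAI

client = AzureOpenAI(
    azure_endpoint="https://my-resource.openai.azure.com",  # hypothetical
    api_key="...",
    api_version="2024-12-01-preview",  # hypothetical: use a version that supports o3-mini
)

response = client.chat.completions.create(
    model="o3-mini",  # your deployment name
    reasoning_effort="high",  # new knob: low | medium | high
    messages=[
        # o-series models take "developer" messages in place of "system" ones.
        {"role": "developer", "content": "You are a terse math tutor."},
        {"role": "user", "content": "Prove there are infinitely many primes."},
    ],
)
print(response.choices[0].message.content)
```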
1:02:46 DeepSeek R1 is now available on Azure AI Foundry and GitHub
- DeepSeek is also now available in the model catalog on Azure AI Foundry and GitHub, joining a diverse portfolio of over 1,800 models, including frontier, open-source, industry-specific, and task-based AI models.
1:03:14 📢 Jonathan – “I’m really excited about what DeepSeek’s done. And I think it’s going to have a huge effect on the rest of the AI industry. Like they’ve completely reworked how the transformers work at a fairly fundamental level. And if we don’t see other people adopting the same changes that they’ve made, I’d be really surprised.”
Oracle
1:05:57 Oracle and Google Cloud Expand Regional Availability and Add Powerful New Capabilities to Oracle Database@Google Cloud
- Oracle and Google Cloud have announced plans to expand Oracle Database@Google Cloud by adding eight new regions over the next 12 months, including locations in the U.S., Canada, Japan, India, and Brazil.
- In addition, they are releasing new capabilities, including:
- Cross-Region Disaster Recovery for Oracle Autonomous Database Serverless. Cool!
- Single-Node VM Clusters for Oracle Exadata Database Service on Dedicated Infrastructure.
Closing
And that is the week in the cloud! Visit our website, the home of The Cloud Pod, where you can join our newsletter and Slack team, send feedback, or ask questions at theCloudPod.net, or tweet at us with the hashtag #theCloudPod.