Spotify's Memory Analysis on GKE & Instacart's Zero-Downtime DB Cutovers
This Week’s Spotlight
Spotify : Analyzing Volatile Memory on a Google Kubernetes Engine Node
Instacart : Zero-Downtime PostgreSQL Cutovers
Spotify : Analyzing Volatile Memory on a Google Kubernetes Engine Node
Summary
Spotify runs its applications on Google Kubernetes Engine (GKE) and monitors its workloads for suspicious behavior with commercial solutions. Alongside those, the team developed an open-source method for analyzing memory on GKE nodes using three tools: AVML to acquire the memory image, dwarf2json to generate the kernel symbol table, and Volatility 3 to analyze the image. The method produces a snapshot of the processes and memory activity on a GKE node, which helps detect potentially malicious activity. Combined, these tools give Spotify and other organizations an open-source option for memory analysis when a commercial solution is unavailable, or a second result to compare a commercial tool against. The approach strengthens monitoring of Spotify's large-scale container workloads on GKE.
Key takeaways
Spotify relies on Google Kubernetes Engine (GKE) to manage its applications across various regions.
Spotify developed an open-source approach using AVML, dwarf2json, and Volatility 3 to analyze memory on GKE nodes, aiding in identifying potential malicious activities.
The method provides a quick snapshot of processes and memory actions on GKE nodes, enhancing security monitoring.
This open-source solution offers an option for memory analysis in cases where commercial alternatives are unavailable or for comparative analysis.
By utilizing this approach, Spotify and other organizations can bolster the security of their container workloads on GKE.
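The three-tool workflow described above can be sketched as a short command plan. This is a minimal sketch, not Spotify's actual procedure: the file paths and the function name are hypothetical placeholders, and the command-line forms are the commonly documented ones for each tool, which may differ between versions. The function only builds the commands; it does not run anything.

```python
from pathlib import Path


def build_memory_analysis_plan(workdir, vmlinux, plugin="linux.pslist.PsList"):
    """Return the three steps of an AVML -> dwarf2json -> Volatility 3 run.

    All paths are hypothetical placeholders; nothing is executed here.
    """
    workdir = Path(workdir)
    image = workdir / "node-memory.lime"   # assumed name for the memory image
    symbols_dir = workdir / "symbols"      # assumed directory for symbol tables
    symbols = symbols_dir / "kernel.json"

    return [
        # 1. Acquire volatile memory on the node with AVML.
        {"argv": ["avml", str(image)], "stdout": None},
        # 2. Build a symbol table from a kernel image that carries debug info.
        #    dwarf2json writes JSON to stdout, so the caller redirects it.
        {"argv": ["dwarf2json", "linux", "--elf", str(vmlinux)],
         "stdout": str(symbols)},
        # 3. List the processes found in the image with Volatility 3.
        {"argv": ["vol", "-f", str(image), "-s", str(symbols_dir), plugin],
         "stdout": None},
    ]


plan = build_memory_analysis_plan("/tmp/forensics", "/tmp/forensics/vmlinux")
for step in plan:
    redirect = f" > {step['stdout']}" if step["stdout"] else ""
    print(" ".join(step["argv"]) + redirect)
```

Keeping the plan as data rather than running it directly makes it easy to review each command before it touches a production node, and to swap in another Volatility 3 plugin for the final step.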
Instacart : Zero-Downtime PostgreSQL Cutovers
Summary
Instacart runs a large number of PostgreSQL databases on AWS RDS and needs to update them without interrupting its services. To do so, it developed a Blue/Green method based on logical replication, which lets it switch between the old and new database versions with only a few seconds of downtime. The team built a tool called Zero Downtime Cutover (ZDC) to simplify the process: it sets up a copy of the database, performs updates and maintenance on the copy, and then quickly switches traffic over to the updated version. Instacart has used the tool successfully for tasks such as upgrades and optimizations, reducing downtime and improving database health. The team plans to automate the process further and extend the tool's capabilities.
Key takeaways
Instacart uses a Blue/Green approach with logical replication to update PostgreSQL databases, keeping each transition to a few seconds of downtime.
The Zero Downtime Cutover (ZDC) tool simplifies complex database updates, making it easy to create, maintain, and switch over to replicas.
ZDC's pre-checks ensure data compatibility, replication setup, and other requirements before initiating updates, reducing human error risks.
ZDC's step-by-step orchestration, including proxy redirection, replication validation, and rollback mechanisms, minimizes downtime during cutover.
ZDC significantly reduces upgrade and maintenance downtime, bolstering Instacart's confidence in database updates and improving user experience.
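The cutover sequence in these takeaways (pre-checks, replication validation, proxy redirection, rollback) can be illustrated with a small simulation. This is a sketch under stated assumptions, not Instacart's ZDC implementation: `FakeCluster`, `cutover`, and the pre-check names are hypothetical, the replication lag is supplied as canned values instead of being read from PostgreSQL, and "rollback" here simply means leaving traffic on the blue side.

```python
from dataclasses import dataclass, field


@dataclass
class FakeCluster:
    """Stand-in for a blue/green pair; a real tool would query PostgreSQL."""
    replication_lag_bytes: list          # lag observed on successive polls
    active: str = "blue"
    writes_paused: bool = False
    log: list = field(default_factory=list)

    def pause_writes(self):
        self.writes_paused = True
        self.log.append("pause_writes")

    def resume_writes(self):
        self.writes_paused = False
        self.log.append("resume_writes")

    def poll_lag(self):
        # Return the next observed lag; once exhausted, repeat the last value.
        if len(self.replication_lag_bytes) > 1:
            return self.replication_lag_bytes.pop(0)
        return self.replication_lag_bytes[0]

    def point_proxy_at(self, target):
        self.active = target
        self.log.append(f"proxy->{target}")


def cutover(cluster, prechecks, max_polls=5):
    """Move traffic from blue to green; stay on blue if green never catches up."""
    failed = [name for name, check in prechecks.items() if not check()]
    if failed:
        return f"aborted: pre-checks failed: {', '.join(failed)}"

    cluster.pause_writes()                # stop new writes so green can catch up
    try:
        for _ in range(max_polls):
            if cluster.poll_lag() == 0:   # green has replayed every change
                cluster.point_proxy_at("green")
                return "cut over to green"
        return "rolled back: replication lag did not reach zero"
    finally:
        cluster.resume_writes()           # writes resume on whichever side is active


checks = {"replication_configured": lambda: True, "schemas_match": lambda: True}

healthy = FakeCluster(replication_lag_bytes=[4096, 512, 0])
print(cutover(healthy, checks))   # cut over to green

stuck = FakeCluster(replication_lag_bytes=[4096])
print(cutover(stuck, checks))     # rolled back: replication lag did not reach zero
```

The write pause is the only window in which clients are blocked, which is why the downtime stays in the range of seconds: traffic moves only after the lag reaches zero, and the `finally` clause resumes writes whether or not the switch happened.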