#1 SRE Podcast

Incidentally Reliable Podcast

Learn how elusive reliability can be, through the journeys, challenges, and triumphs of those who work to make our digital lives increasingly reliable.

Available on all your favourite platforms

Featured

How Solomon Hykes Disrupted Deployments, From Docker to Dagger

Listen to Solomon Hykes — the co-founder of Docker and Dagger — talk about the evolution of containerization, reliability in DevTools, and the criticality of decisions when you're building software used by every engineer in the world.

Episodes

Press Start to Scale: SRE in Gaming - Incidentally Reliable with Denys Pashutynski

Episode #13

In our latest episode, we speak with Denys Pashutynski, Senior Engineering Manager of Site Reliability at Roblox, about the formidable challenges of sustaining a global gaming platform. Drawing from his tenure at Twitter, AWS, and eBay, Denys delves into managing traffic surges, latency optimization, and strategic change management.

Battle-Tested Reliability Strategies with Abhishek Ghosh

Episode #12

We dive into the trenches with Abhishek Ghosh, a veteran who has led SRE teams at Pinterest, and now at Cribl. He shares gripping war room stories from Pinterest, strategies for maintaining uptime, insights into the role of AI in observability, and more! Discover the future of SRE and learn how to navigate the challenges of digital reliability. Tune in to gain valuable lessons from one of the industry's leading experts.

The Science of Building Cloud Native DevTools

Episode #11

Catch Ramiro Berrelleza, Founder and CEO at Okteto, talk about how impactful DevTool startups are built, the importance of investing in Developer Experience, and the emerging issues with the Cloud Native ecosystem.

Credit-Worthy Reliability

Episode #10

Catch Krishnendu Majumdar talk about his journey in the dynamic Indian startup ecosystem, strategies to build for scale from Day 1, and insights into building sustained user trust via exceptional product performance in high-governance industries like credit and finance.

Reliability For The Books with Niall Murphy

Episode #9

Catch Niall talk about graceful degradation, what startups are getting wrong about reliability, and how a well-thought-out user experience can communicate credibility to current and potential customers.

Have Someone in Mind?

Do you have a story to share? Or know someone who does? We would love to hear from you! Apply or nominate someone to be a guest on our podcast.

Incidentally Reliable Blogs

Byte-sized content from the front lines of Site Reliability.

Subscribe & Get Goodies

We give away Incidentally Reliable T-shirts, Mugs, Backpacks and more to 3 subscribers every month!

Learn how 1000+ companies across the globe bring down their MTTRs by 60%

4.6 out of 5 on G2

You're in Good Hands

E COMMERCE

IndiaMart currently has over 152,000 paying subscribers who make up 95% of the revenue. This makes it very important for the company to deliver an uninterrupted online experience as it directly impacts the subscriber experience and thence the revenue. Our most important KPI is to ensure uptime of our all production websites and without Zenduty, we can not do this. Zenduty plays a vital role in maintaining uptime of our websites by providing alerts timely.

Vinay Singh

DevOps Manager, IndiaMart

FINTECH

Zenduty helps the team keep a track of weekly occurring, re-occurring issues, we design the on-call schedule on the tool to escalate the alerts to the on-call engineer and provide us with a robust interface to manage the incident within Slack, which is our team communication channel. And lastly, the MTTA and MTTR are recorded and visualized on the tool, to help us compare the actual and target numbers for improvements. We are happy to pay for on-call tools because the value this tool adds for our engineers and customers is much higher than the money.

Rohit Khatana

VP of Engineering, Qoala

ENTERTAINMENT

It's a great incident management tool that helps us enable faster and better incident resolution. We have all our critical applications and system alerts configured on Zenduty. We manage end-to-end Incidents on Zenduty.

Mohammed Shabbir S

Technical Support Lead, Bookmyshow

FINTECH

I like Zenduty’s intuitive user experience throughout, be it the web UX, the Android UX, and most of all the Slack UX. We pretty much manage all our incidents from sitting inside a Slack channel and that’s awesome. Other than being a robust product, I found the people providing support are the best. As an operations team lead myself, I understand the day-to-day toil and stress support can cause. The Zenduty team has just always been there, friendly, coolheaded, and ready to take action on anything thrown at them.

Heinrich Roets

Operations Support, Electrum Payments

TRAVEL

Zenduty has been great so far in terms of delivering meaningful alerts to the right person quickly, which has also improved our uptime significantly. We did not face any challenges with the onboarding, and received great support from everyone in the team - further reducing our friction while moving to Zenduty. The turnaround for most of our requests was super quick and the team was very helpful while resolving our queries!

Atmesh Mishra

Associate Vice President - Platform at Chalo

HOSPITALITY

All (infrastructure downtime) alerts by default go to DevOps receiver … which trigger SMS and a phone call to the concerned on-call developer based on the configuration provided. (Zenduty) ... is pretty intuitive with great support and configuring them with escalation policies is a piece of cake.

Abhay Kumar

Lead DevOps Engineer at Zolo Stays

TECH

Zenduty serves as a command and control center for our DevOps/incident management functions. Over time, we have deployed a whole bunch of monitoring services through AWS, Datadog, GitHub, Monit, Python scripts, etc, and it has become quite a challenge to centralize the alerts and act upon them. Zenduty gives a clean way of pulling together these alerts, classifying them, assigning them, acting on them and then sending out updates.

Anirban Mazumdar

CTO, Urbanpiper

TECH

Easy to configure, intuitive platform that triggers alerts from our monitoring tools such as Datadog, AWS Cloudwatch, GCP, etc, and helps us respond to incidents faster.

Felipe Urbina

CTO, Simpliroute

SIGN UP TODAY FOR FREE

NO CREDIT CARD REQUIRED

Be Prepared for Incident Response with Zenduty

Your robust incident management and alerting platform. Organizations have experienced reduced alert fatigue with over 60% reductions in MTTA & MTTR while experiencing growth.