Business continuity

Disaster recovery for SMEs: from the plan to recovery

Disaster Recovery · Business Continuity · Backup · RPO/RTO · SME

In short: disaster recovery (DR) is the ability to bring systems and processes back into operation after a serious event such as a failure, a human error or a ransomware attack. It is not the same as backup: copies are only one piece. What makes the difference is a plan measured by two numbers, RPO (how much data you can lose) and RTO (how quickly you are back up), and above all a plan that has been tested. And beware: the RTO you declare is rarely the one you will actually experience. Here is what an SME needs, without unnecessary jargon.

RPO: How much data you can lose
RTO: How quickly you are back up
Immutable backup: Ransomware-proof

Backup, disaster recovery and high availability are not the same thing

These three concepts are often confused, and the confusion is costly when it really matters. Backup is the copy of data that lets you recover it after a deletion, a failure or an attack. Disaster recovery is the combination of plans, procedures and resources that brings systems and processes back into operation after a serious event. High availability (HA), by contrast, aims to prevent downtime through redundancy, which is another matter entirely.

The most important distinction is this: high availability does not protect against ransomware or logical deletion. If a system is encrypted or a piece of data is deleted by mistake, synchronous redundancy propagates the problem to the copy instantly. That is why backup and disaster recovery remain essential even when the infrastructure is redundant: they answer different questions.

The two numbers that govern everything: RPO and RTO

A disaster recovery plan is measured by two parameters. RPO (Recovery Point Objective) is the maximum amount of data you can afford to lose, expressed in time: if you back up every 24 hours, in the worst case you lose a day of work. RTO (Recovery Time Objective) is the maximum time within which you must be operational again after the incident.

The key point is that RPO and RTO are defined per system, based on its criticality. The line-of-business application that halts production will have an RPO and RTO of a few hours; a historical document archive can tolerate a day or more. It is these two numbers that say how much to invest in backup and redundancy — not the other way around. Defining them is a business decision before a technical one: they quantify what each process is worth, in time and data.
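As a rough illustration of per-system objectives, the sketch below records an RPO and RTO for each system and flags those whose backup interval cannot meet the stated RPO. The system names, the figures and the `SystemObjective` class are hypothetical, chosen only to mirror the examples in the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SystemObjective:
    """Recovery objectives for one system (all values are illustrative)."""
    name: str
    rpo_hours: float              # maximum tolerable data loss, expressed as time
    rto_hours: float              # maximum tolerable downtime
    backup_interval_hours: float  # how often a restore point is created


def rpo_violations(systems):
    """Return the names of systems whose worst-case data loss exceeds the RPO.

    In the worst case the incident happens just before the next backup,
    so the data lost corresponds to the full backup interval.
    """
    return [s.name for s in systems if s.backup_interval_hours > s.rpo_hours]


# Hypothetical inventory: a daily backup is fine for the archive,
# but not for an application with a 4-hour RPO.
inventory = [
    SystemObjective("line-of-business app", rpo_hours=4, rto_hours=4, backup_interval_hours=24),
    SystemObjective("document archive", rpo_hours=24, rto_hours=48, backup_interval_hours=24),
]

print(rpo_violations(inventory))  # ['line-of-business app']
```

A check like this only covers the RPO side; whether the RTO is achievable depends on how fast the restore actually runs.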

The RTO on paper almost always lies

There is a systematic gap between the RTO written in the plan and the real recovery time. Two reasons, almost never put in writing, blow it apart.

The first is arithmetic. A 4-hour RTO often coexists with backups of tens of terabytes and a link of a few hundred Mbit/s: restoring 20 TB over a 200 Mbit/s link takes days, not hours, by the sheer physics of the transfer. The number on the plan stays 4; the stopwatch says about nine days, and that is with the link fully saturated. It is the calculation nobody does at planning time, and it changes everything: low RTOs require fast local copies, replicas already standing by or physical seeding, not just a backup “in the cloud”.
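The transfer arithmetic can be checked in a few lines. This is a minimal sketch, assuming decimal units (1 TB = 10^12 bytes) and a link used at full nominal speed unless an `efficiency` factor is given; the function name and the `efficiency` parameter are illustrative, not part of any tool.

```python
def restore_time_hours(data_terabytes: float, link_mbps: float, efficiency: float = 1.0) -> float:
    """Estimate the hours needed to transfer a backup over a network link.

    Uses decimal units (1 TB = 10**12 bytes, 1 Mbit/s = 10**6 bit/s).
    `efficiency` is the assumed fraction of nominal bandwidth actually achieved.
    """
    if link_mbps <= 0 or not 0 < efficiency <= 1:
        raise ValueError("link_mbps must be > 0 and efficiency in (0, 1]")
    bits = data_terabytes * 1e12 * 8
    seconds = bits / (link_mbps * 1e6 * efficiency)
    return seconds / 3600


hours = restore_time_hours(20, 200)
print(f"{hours:.0f} h, about {hours / 24:.1f} days")  # 222 h, about 9.3 days
```

The estimate covers the data transfer only; rebuilding, validating and restarting the systems come on top of it, so the real figure can only be higher.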

The second is forensic. In a ransomware attack the attackers often stay on the network for weeks before encrypting. The most recent backup, however “clean” it looks, may already contain their backdoor: restoring it means putting the attacker back into production too. The real question is not “do I have the backup?” but “how far back do I have to go to find a clean one?”. Finding that point takes analysis, and while you look the real RTO drifts away from the promised hours. This is why a long history of restore points and the ability to verify a backup before trusting it matter.
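The "how far back" question can be sketched as a simple selection over the available restore points. The dates and the function name below are hypothetical, and the intrusion date is assumed to come from forensic analysis; a point older than that date is only a candidate and still has to be verified before it is trusted.

```python
from datetime import date


def latest_point_before(restore_points, intrusion_date):
    """Return the newest restore point strictly older than the intrusion date.

    Returns None when retention does not reach back far enough, that is,
    when every available restore point was taken on or after the intrusion.
    """
    candidates = [p for p in restore_points if p < intrusion_date]
    return max(candidates, default=None)


# Hypothetical weekly restore points.
points = [date(2024, 5, 6), date(2024, 5, 13), date(2024, 5, 20), date(2024, 5, 27)]

print(latest_point_before(points, date(2024, 5, 15)))  # 2024-05-13
print(latest_point_before(points, date(2024, 5, 1)))   # None
```

The second call shows why retention depth matters: with only four weeks of history, an intrusion that began earlier leaves no candidate at all.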

What a DR plan actually contains

A disaster recovery plan is not a document for the drawer: it is what allows people under stress to act in a coordinated way. The elements that cannot be missing:

  • 1. Inventory and priorities: which systems and processes are critical, in what order to restore them, with which RPO and RTO.
  • 2. Risk scenarios: hardware failure, human error, ransomware, site disaster. Each requires a different response.
  • 3. Roles and responsibilities: who decides, who executes, who communicates. Without clear roles, the plan stalls at the first surprise.
  • 4. Operational runbooks: the step-by-step recovery procedures, written so they can be executed even by someone who did not draft them.
  • 5. Recovery resources and communication: where you restore (alternate site, datacenter or cloud) and how you inform clients, suppliers and, where applicable, the authorities.
  • 6. Periodic testing: the most neglected and most important element. A plan never rehearsed is a hypothesis, not a guarantee.

Recovery after ransomware: the acid test

Ransomware is the scenario that tests a DR plan like no other, because it goes straight for the copies. Attackers seek out and encrypt the backups reachable from the network before hitting the systems: if your copies are online and accessible with the same credentials as the infrastructure, they are not a guarantee.

The countermeasures that change the outcome: immutable backups (not modifiable or deletable for a defined period, not even by an administrator), offline or air-gapped copies, credentials separate from the production domain, and a defined recovery order (identity and domain controllers first, then critical systems). Operational speed matters too: in the cloud, tools such as bulk restore of virtual machines bring dozens of servers back in parallel rather than one at a time. And first of all: verify that the backups are not already compromised. On this front, a managed incident response service makes the difference between a downtime of hours and one of weeks.
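A defined recovery order can be derived from a dependency map rather than kept in someone's head. The sketch below uses Python's standard `graphlib` module (Python 3.9 or later) to group systems into waves that can be restored in parallel; the system names and their dependencies are hypothetical.

```python
from graphlib import TopologicalSorter

# Hypothetical dependency map: each system lists what must be running first.
depends_on = {
    "domain controller": set(),
    "file server": {"domain controller"},
    "database": {"domain controller"},
    "line-of-business app": {"database", "file server"},
}

sorter = TopologicalSorter(depends_on)
sorter.prepare()  # raises graphlib.CycleError if the dependencies are circular

waves = []
while sorter.is_active():
    ready = sorted(sorter.get_ready())  # systems whose dependencies are all restored
    waves.append(ready)
    sorter.done(*ready)

for number, wave in enumerate(waves, start=1):
    print(f"wave {number}: {', '.join(wave)}")
# wave 1: domain controller
# wave 2: database, file server
# wave 3: line-of-business app
```

Systems within the same wave do not depend on each other, so they are the natural candidates for a parallel restore.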

Disaster recovery in the cloud for SMEs

The cloud is an excellent disaster recovery enabler, but it does not replace a disaster recovery plan. Azure Backup offers immutable copies and soft-delete; Azure Site Recovery replicates virtual machines to a different region and enables failover; and the data in Microsoft 365 stays yours and must be protected with a dedicated backup, because the shared responsibility model leaves data protection to you, not to the provider.

For an SME the sensible choice is often a managed disaster recovery service: definition of RPO and RTO per process, immutable backups, replication of virtualised workloads to European datacenters and documented tests. It is also the most straightforward way to meet the continuity requirements set by NIS2 and the GDPR without building and maintaining everything in-house.

AtWorkStudio has been operating from Piacenza since 2000. We are certified ISO/IEC 27001, 27017, 27018 and ISO 9001, ACN (Italian National Cybersecurity Agency) qualified for cloud services, members of Clusit (Italian Association for Information Security) and affiliated to Confindustria Piacenza in the RICT cluster.

Sources

  • NIST SP 800-34 Rev. 1 — Contingency Planning Guide for Federal Information Systems
  • ISO 22301 — Business Continuity Management Systems
  • ENISA — Guidance on resilience and business continuity
  • Microsoft Learn — Azure Backup and Azure Site Recovery (official documentation)

How long would your business survive a shutdown?

We help you define RPO and RTO for your processes, secure your backups and build a tested disaster recovery plan — before you actually need it. Let’s talk.