When we say that a system is “at risk”, we mean that we don’t expect any downtime, but the likelihood of downtime is higher than usual.
For example, in a “high availability” configuration, there may be two servers providing the functionality of one. Typically, one such server will be “live” and the other “standby”, ready to take over if the live server fails for some reason.
If we take the standby server down to perform some maintenance, the service is said to be “at risk” because now if the live server fails, the service itself may become unavailable.
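As a rough illustration of that reasoning (a minimal sketch, not a description of any real monitoring tool), the state of a two-server service can be classified from the health of each server. The function name `service_state` and the three labels are hypothetical, and the sketch assumes that failover to the standby succeeds when the live server is down.

```python
def service_state(live_up: bool, standby_up: bool) -> str:
    """Classify a two-server service (hypothetical labels, for illustration)."""
    if live_up and standby_up:
        # Full redundancy: either server can fail without an outage.
        return "normal"
    if live_up or standby_up:
        # Only one server left: the service works, but one more failure
        # would take it down. This is the "at risk" state.
        # Assumption: if only the standby is up, failover has succeeded.
        return "at risk"
    # Neither server is available.
    return "unavailable"


if __name__ == "__main__":
    # Standby taken down for maintenance while the live server keeps running.
    print(service_state(live_up=True, standby_up=False))
```

Note that “at risk” here describes reduced redundancy, not a fault: the service is still fully available to users.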
Another example would be the reconfiguration of web server software, such as Apache. After such work, it will be necessary to reload the web server configuration, but if that fails for some reason, the web server may be unavailable. Yes, it’s possible to check the configuration before reloading it (and we do), but the acid test is the reload itself.
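The check-then-reload sequence can be sketched as follows. This is an illustrative sketch only, not our actual procedure: it assumes Apache's `apachectl` control script is on the PATH (some systems name it `apache2ctl`), and the helper names `run_command` and `safe_reload` are hypothetical.

```python
import subprocess
from typing import Callable, Sequence

# Assumption: apachectl is installed and on PATH (apache2ctl on some systems).
CHECK_CMD = ["apachectl", "configtest"]  # syntax-check the configuration
RELOAD_CMD = ["apachectl", "graceful"]   # reload without dropping connections


def run_command(cmd: Sequence[str]) -> int:
    """Run a command and return its exit code (0 means success)."""
    return subprocess.run(list(cmd), check=False).returncode


def safe_reload(run: Callable[[Sequence[str]], int] = run_command) -> str:
    """Reload the web server only if the configuration check passes."""
    if run(CHECK_CMD) != 0:
        # The running server is untouched, so nothing has been put at risk yet.
        return "check failed"
    if run(RELOAD_CMD) != 0:
        # The check passed but the reload still failed: this is the
        # "acid test" case that the check alone cannot rule out.
        return "reload failed"
    return "reloaded"
```

The command runner is passed in as a parameter so that the logic can be exercised without a real web server. The second failure branch is the reason the work is still “at risk” even after a clean configuration check.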
A final example would be recabling part of an equipment rack. We would typically carry out such work outside core business hours, and we would describe that time as “at risk” because cables that are in use could be disturbed, which would affect services.
On almost all occasions when we have described services as being “at risk”, there has been no downtime. However, when such work is necessary, we believe you should be informed that services are at risk, and we will always strive to give you advance notice.