Water System Risk and Resilience

Professionals must know how to evaluate and lessen risks for their public water system clients.

By Ed Butts, PE, CPI

Unlike most private and domestic water systems, a public water system must adhere to specific rules and regulations governing the system.

Because of past and present threats to potable water infrastructure and public health, one of the current requirements is the preparation of a Risk and Resilience Assessment. The assessment provides a community water system with a summary of the water system's reliability, redundancy, risk, and resilience, along with recommended actions.

The actions are those needed in the unlikely event of a severe emergency affecting water supply, public health, and safety resulting from bioterrorism, vandalism, aquifer depletion, equipment or well failure, or any naturally occurring event. These assessments are required for all community water systems that serve a population of more than 3300.

Ensuring the continued resilience and reliability of a potable water system is a critical function of water purveyors. In the United States, communities are generally willing to accommodate short-term (on the order of a few days) disruptions in water and wastewater services resulting from man-made or natural disasters. However, longer-term disruptions are less tolerable.

By way of example, the Oregon Resilience Plan indicated that if a business cannot reoccupy its facilities, including access to functioning water and wastewater systems, within one month, it will be forced to move or dissolve. The timeline likely varies depending on the needs of individual communities, type of business, and severity of the disaster. But it is evident that a potable water system, large or small, must be designed and operated with sufficient reliability and redundancy to avoid sustained water outages.

Many WWJ readers regularly work for municipal and other public water system clients and service water systems that use groundwater sources. The dependency these groundwater systems place on their wells and pumping stations creates a condition where the reliability, and in many cases the redundancy, of these facilities is paramount.

Although most risk and resilience assessments must be completed within the next few months of 2021, understanding the concepts and potential risks associated with a water system can benefit all of us who work in this industry.

American Water Infrastructure Act

The American Water Infrastructure Act (Public Law 115-270) was signed into law on October 23, 2018. AWIA Section 2013 requires community water systems serving more than 3300 people to develop or update risk assessments and emergency response plans (ERPs). For a defined small water system serving more than 3300 but fewer than 50,000 people, submission of the risk assessment was required by June 30, 2021. An updated ERP is due six months later by December 31, 2021.

A Risk and Resilience Assessment is typically prepared to comply with the American Water Works Association (AWWA) Standard J100-10, Risk Analysis and Management for Critical Asset Protection (RAMCAP).

The RAMCAP Utility Resilience Index scores water utilities on operational and financial resilience. The seven operational indicators represent a “utility’s organizational preparedness and capabilities to respond and restore critical functions/services following an incident.”

Five financial indicators represent a utility’s financial preparedness and ability to adequately respond to an incident. Each of the indicators is scored with a value from 0 to 1, and the operational and financial indices are multiplied by weighting factors and then summed. The maximum value of the index is 100, with most water systems scoring between 60 and 70. The Utility Resilience Index takes a high-level approach to measuring resilience, but it is not a fully valid system-level measure because it does not account for interconnections between the indicators.
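
The weighted-sum arithmetic behind an index of this kind can be sketched in a few lines of Python. This is only an illustration: the indicator scores and the weighting factors shown are hypothetical placeholders, not the values published in AWWA J100. The only properties taken from the description above are that there are seven operational and five financial indicators, each scored from 0 to 1, and that the weighted scores sum to a maximum of 100.

```python
def weighted_index(scores, weights):
    """Return the sum of score * weight, with each score between 0 and 1."""
    if len(scores) != len(weights):
        raise ValueError("one weight per indicator is required")
    for score in scores:
        if not 0.0 <= score <= 1.0:
            raise ValueError("indicator scores must lie between 0 and 1")
    return sum(s * w for s, w in zip(scores, weights))


# Hypothetical scores for 7 operational and 5 financial indicators.
operational_scores = [1.0, 0.5, 0.75, 0.5, 1.0, 0.25, 0.5]
financial_scores = [0.75, 0.5, 1.0, 0.5, 0.75]

# Assumed weights, chosen only so that all twelve weights total 100.
operational_weights = [10, 10, 10, 5, 10, 5, 10]  # totals 60
financial_weights = [8, 8, 8, 8, 8]               # totals 40

uri = (weighted_index(operational_scores, operational_weights)
       + weighted_index(financial_scores, financial_weights))
print(f"Utility Resilience Index: {uri:.1f} out of 100")
```

Because each indicator is scored and weighted on its own, a sketch like this also shows the limitation noted above: nothing in the sum reflects how one indicator affects another.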

AWWA Standard J100 uses an all-hazard approach, considering factors such as malevolent threats, natural hazards, risk, and dependency and proximity threats. In addition to these threats, specific threats related to the unique nature of the water system are also analyzed. For groundwater-dependent systems, these may include probability of equipment failure based on the use of submersible or vertical turbine pumps, well failure, and long-term drought or aquifer depletion at each site.

As stated in the American Water Infrastructure Act, the Risk and Resilience Assessment should address the following minimum elements:

  • Risks to the water system including malevolent acts such as physical or cyber vandalism, sabotage, or terrorism; dependency failures including pumping equipment failure, well failure, and utility power failure; long-term source (aquifer) depletion; and natural hazards such as earthquakes, ice and snowstorms, and high-wind events
  • Resilience of the distribution system pipes and constructed conveyances; physical barriers, source water, water collection, and intake; pretreatment, treatment, storage, and distribution facilities; electronic, computer, SCADA, and other automated systems (including the security of all such systems) that are utilized by the water system
  • Monitoring practices of the water system
  • Financial infrastructure and stability of the water system
  • Use, storage, and handling of the various chemicals used by the water system
  • Operation and maintenance of the water system
  • An optional evaluation of the capital and operational needs for risk and resilience management for the water system.


Figure 1. Typical event tree for pathogen transport in aquifer.
Figure 2. Typical event tree for chemical transport in aquifer.

Defining Water System Risks

Although common risks exist with virtually all water systems, certain risks and exposures are greater for some water systems than for others.

For example, water systems on the West Coast of the United States face a greater risk from seismic hazards, while water systems in the Gulf Coast states generally face a higher risk from hurricanes.

Risks can be applied to the entire water system or to individual facilities as with a multiple-well system. Several methods exist to determine the relative risk of each threat.

One of the most common methods uses “event trees,” which build a matrix of risk from a string of potential occurrences in progressive order to arrive at a probability of the threat. Fundamentally, if any one of the events (or branches) does not occur, the ultimate event does not occur.
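
The logic of a single path through an event tree can be sketched as a chain of conditional probabilities multiplied together. The branch names and probabilities below are hypothetical and are not taken from Figure 1 or Figure 2; the sketch also assumes each probability is stated on the condition that all earlier branches have occurred.

```python
from math import prod


def event_tree_probability(branch_probabilities):
    """Probability that every branch in the chain occurs.

    Each entry is the assumed probability of one event, given that
    all earlier events in the chain have already occurred.
    """
    probabilities = list(branch_probabilities)
    for p in probabilities:
        if not 0.0 <= p <= 1.0:
            raise ValueError("probabilities must lie between 0 and 1")
    return prod(probabilities)


# Hypothetical branches for a contaminant reaching a well.
branches = {
    "contaminant released near the wellhead": 0.10,
    "contaminant reaches the aquifer": 0.20,
    "contaminant travels to the well screen": 0.50,
    "contaminant survives treatment": 0.05,
}

p_outcome = event_tree_probability(branches.values())
print(f"Probability of the ultimate event: {p_outcome:.5f}")

# If any one branch cannot occur, the ultimate event cannot occur.
blocked = {**branches, "contaminant reaches the aquifer": 0.0}
p_blocked = event_tree_probability(blocked.values())
print(f"Probability with one branch blocked: {p_blocked:.5f}")
```

Setting any single branch to zero drives the result to zero, which is the property described above and the reason a countermeasure placed at one branch can be effective.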

Figure 1 displays a typical event tree scenario for pathogen transport to a well while Figure 2 displays an event tree simulation for chemical transport to the same well. Based on an analysis of exposure, importance to the utility, and relative frequency of incidence, the typical risks for groundwater systems in no specific order of magnitude consist of the following:

  • Localized or widespread power failures
  • Single or multiple well and pump failures
  • Waterborne illness or diseases from pathogens or biotoxins
  • Structure or site intrusion or security breach
  • Control or SCADA system failure, breach, or intrusion
  • Natural disasters, specifically:
    o Snow or ice storms
    o Windstorms and hurricanes
    o High-water and flooding events
    o Seismic hazards
  • Vandalism
  • Terrorism
  • Backflow from cross connections
  • Extended drought with aquifer depletion
  • Tsunamis
  • Volcanic eruptions and lava flow
  • Landslides
  • Wildfires
  • Chemical spills
  • Construction accidents impacting the water system.

As can be seen, the list of potential threats to a water system is long and includes many threats that would be highly unusual or unlikely for certain systems. This requires a careful appraisal of the specific threats, vulnerabilities, and the outcome or consequences faced by the water system or site in question.

Accordingly, the actual risk is determined from the following relationship:

Risk (R) is the product of the Threat (T) × Vulnerability (V) × Consequences (C) where:

T = Likelihood that the threat will or could be perpetrated or occur against the asset

V = Likelihood that the threat will damage the asset, considering the effectiveness of countermeasures

C = Economic (cost to the utility and region) and public health (injuries and deaths) impacts resulting from damage to the asset.
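
As a rough sketch of how this relationship can be used to rank threats, the following Python example multiplies the three factors for a few scenarios. The scenarios and all numbers are hypothetical. Expressing T as an annual likelihood and C in dollars only is a simplifying assumption made here; the definition above also includes public health consequences, which would need their own scale.

```python
def risk_score(threat, vulnerability, consequence):
    """R = T x V x C, with T and V as likelihoods between 0 and 1."""
    for name, value in (("threat", threat), ("vulnerability", vulnerability)):
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must be a likelihood between 0 and 1")
    return threat * vulnerability * consequence


# Hypothetical scenarios: (T per year, V, C in dollars).
scenarios = {
    "power failure at Well 1": (0.50, 0.80, 40_000),
    "seismic damage to reservoir": (0.01, 0.60, 2_000_000),
    "site intrusion at booster station": (0.05, 0.30, 150_000),
}

# Rank the scenarios from highest to lowest risk.
ranked = sorted(
    ((name, risk_score(*tvc)) for name, tvc in scenarios.items()),
    key=lambda item: item[1],
    reverse=True,
)
for name, risk in ranked:
    print(f"{name}: risk of about ${risk:,.0f} per year")
```

With these assumed inputs, a frequent but inexpensive event can rank above a rare but costly one, which is why all three factors need to be estimated rather than any one alone.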

Seismic hazards can broadly impact water systems given that earthquakes regularly cause considerable damage to buried lifelines (e.g., water distribution systems) and above-ground facilities such as wells, reservoirs, and pumping stations with inadequate resistance.

Earthquake-induced permanent ground displacement can also cause well casing buckling, well pump shearing failure, and well or booster pump discharge piping damage. The lateral or vertical forces associated with moving or shifting ground can bend well casings and break well or booster pump discharge piping. Seismic events can also weaken or displace foundations and footings used under buildings and water storage reservoirs or buckle the walls of steel reservoirs.

In addition to seismic events, other natural or human-caused hazards can have major impacts on above-ground facilities such as water storage reservoirs and well and booster pump stations. It is therefore important to appropriately consider all identified hazards when evaluating the disaster resilience of a water system. System interdependencies (e.g., loss of commercial electrical power in a wind or ice/snowstorm) can have a significant impact on the operability and functionality of water systems.

Most groundwater systems utilize multiple types of facilities and each should be evaluated individually and by the intended use and total production percentage of the site to ascertain the pertinent threat levels, vulnerabilities, potential consequences, and potential hazard mitigation techniques. Groundwater systems generally include three or more of the following:

  • Water sources (groundwater sources extracted from deep wells)
  • Water source pumping equipment (centrifugal, submersible, or vertical turbine pumps)
  • Water storage reservoirs (ground level and elevated)
  • Booster pump stations (inline or extracting water from ground level reservoirs)
  • Water distribution system (zones or singular).

Defining Water System Resilience

Resilience, according to the base definition provided by the U.S. Environmental Protection Agency in 2015, is “the ability to anticipate, prepare for, and adapt to changing conditions and withstand and recover rapidly from disruptions.” In equation form, it is expressed as:

Resilience = R (risk) × 4Rs (control factors)

The 4Rs of resilience include: (1) resistance, (2) reliability, (3) redundancy, (4) response and recovery.

The challenge is to develop a practical metric that assesses current resilience levels in a consistent manner and drives improvement across all aspects of the service without being overly complicated and inefficient to implement.

Proper preparedness is also a large component of resilience. Resilient systems are prepared to quickly address and manage hazards with a minimal loss of functionality and system integrity. However, although communities can be prepared with emergency response plans and appropriate mitigation strategies, they may not demonstrate resilience during an actual hazard.

The first R of resilience, resistance, implies that the water system has been constructed with durable and long-lasting materials and installed using quality workmanship; the control system is resistant to intrusion or hacking; and proper security measures are present at all sites to preclude breaches or unauthorized entry.

The fourth R, response and recovery, simply refers to the water system’s ability to respond to threats and recover from them in a timely manner.

The second and third Rs, reliability and redundancy, are so important they warrant a separate discussion.

Control Factor of Reliability

The reliability of a pumping or water system is dependent on the individual reliabilities of the many components that go into each system. For example, when considering overall reliability, a water pumping system generally includes the individual reliabilities of the source of water, pump, driver, pump to driver lineshaft, cable, or coupling (if any), mechanical components, electrical system (if electrically powered), support equipment unique to the system (automatic oiler, mechanical seal, etc.), and control components.

Each pump component’s reliability is typically factored into producing an overall pumping system reliability. Unless system or component redundancy is employed, failure of any one of these components can result in an interruption of operation to the entire system.

Therefore, for the purposes of a reliability assessment, a pumping system is considered to be a series or interdependent system. The proper selection and sizing during design of each component is the first step to a dependable and functional system. However, long-term and continuing reliable operation depends on the effective initial performance testing and commissioning of each piece of equipment and the water system as a whole.
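
For a series system, the usual calculation multiplies the individual component reliabilities, assuming the components fail independently. The sketch below uses hypothetical reliabilities for a single well pumping system; the values are placeholders and not field data.

```python
from math import prod


def series_reliability(component_reliabilities):
    """Reliability of a chain in which every component must work."""
    return prod(component_reliabilities)


# Hypothetical component reliabilities for one well pumping system.
well_station = {
    "water source (well)": 0.99,
    "pump": 0.97,
    "motor (driver)": 0.98,
    "cable or lineshaft": 0.99,
    "electrical supply": 0.96,
    "controls": 0.98,
}

r_station = series_reliability(well_station.values())
print(f"Pumping system reliability: {r_station:.3f}")
```

The combined figure is always lower than that of the weakest component, which is why each component in the chain deserves attention during selection and sizing.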

As the final phase of a construction project, proper commissioning will impact the separate issues of asset management, operational performance, contractor and manufacturer warranties, program automation, ongoing maintenance requirements, and ample operator knowledge through adequate levels of training. Planning for the implementation of commissioning activities during the design phase will ensure that responsibility is assigned and that appropriate advance scheduling and performance testing are carried out during the construction process.

A complete commissioning program can ensure adequate testing and training activities are conducted for successful long-term performance, operation, and asset management of the pumping plant. Proper commissioning of a pumping plant verifies the integrity of installed components, addresses potential problems, and reduces the time needed during the official startup phase. This greatly impacts the reliability of the plant.

Generally, where a system uses multiple wells or pumping plants, the supporting elements of transmission pipelines and distribution networks are not included in the pumping system reliability. Instead, they are included in the water system reliability, as these components are a common element of the entire water system.

A typical water system not only includes the various components in a single pumping system, but also uses multiple pumping stations and often common sources, transmission and distribution mains, and a master controller or SCADA control system. For this reason, a water system with multiple sites or wells is usually considered as a parallel system for the purposes of a reliability assessment.
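
For a parallel system, the usual calculation finds the chance that every site is out of service at once and subtracts it from 1. The sketch below rests on two assumptions that should be checked for any real system: that any one site can keep the system in service on its own, and that the sites fail independently. A shared event, such as a regional power outage, violates the second assumption. The site reliabilities are hypothetical.

```python
from math import prod


def parallel_reliability(site_reliabilities):
    """Reliability when any one site can keep the system in service."""
    return 1.0 - prod(1.0 - r for r in site_reliabilities)


# Hypothetical reliabilities of three independent well sites.
sites = [0.88, 0.85, 0.90]

r_system = parallel_reliability(sites)
print(f"Water system reliability: {r_system:.4f}")
```

Under these assumptions the system is more reliable than its best single site, which illustrates the value of redundancy discussed later in this column.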

A water transmission or distribution system is generally omitted from this calculation. This is because a water transmission or distribution system in reasonably good condition and less than 30 years old, constructed using high quality materials and with good installation, operation, and ongoing maintenance practices, is typically 99% or greater in component reliability.

However, a simplified methodology has been developed for the assessment of a water transmission or distribution system’s reliability. This requires an evaluation of the system, including an inventory of the size, material type, and age. This must be followed by an assessment of the current condition of its components, system looping and reinforcement, isolation valving, history of past breaks and their frequencies, and the operational aspects of exposure to water hammer, corrosion, and potential traffic or seismic damage.

These factors are further segregated into relative exposures of particularly vulnerable sections and areas. The overall reliability is then determined by an average of the worst level of exposure to risk versus the least exposure to risk.

In addition to an analytical evaluation, design guidelines are available to improve system reliability. The simulation approach is often used to calculate the water system’s overall reliability. Pumps with a failure rate exceeding three breakdowns per year and a five-day repair time should have redundant or standby capacity of at least 150%, 67%, or 25% of the rated GPM in the case of two, three, or four working pumps, respectively.
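
The guideline above can be sketched as a simple lookup. The percentages and the failure threshold come from the paragraph above. Reading the threshold as "more than three breakdowns per year with a repair time of five days or longer," and applying the percentage to a rated GPM supplied by the user, are interpretations made here for illustration.

```python
# Working pumps -> standby capacity as a percent of rated GPM (from the text).
STANDBY_PERCENT = {2: 150, 3: 67, 4: 25}


def expected_downtime_days(failures_per_year, repair_days):
    """Average days out of service per year for one pump."""
    return failures_per_year * repair_days


def needs_standby(failures_per_year, repair_days):
    """Assumed reading of the guideline threshold quoted in the text."""
    return failures_per_year > 3 and repair_days >= 5


def standby_gpm(working_pumps, rated_gpm):
    """Minimum standby capacity in GPM for two, three, or four working pumps."""
    if working_pumps not in STANDBY_PERCENT:
        raise ValueError("guideline covers two, three, or four working pumps")
    return rated_gpm * STANDBY_PERCENT[working_pumps] / 100.0


# Hypothetical pump: four breakdowns per year, five days to repair each.
downtime = expected_downtime_days(4, 5)
availability = 1 - downtime / 365
print(f"Downtime: {downtime} days per year, availability {availability:.1%}")

if needs_standby(4, 5):
    print(f"Standby capacity: {standby_gpm(3, 500):.0f} GPM")
```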

Control Factor of Redundancy

The redundancy of sources, pumping plants, water treatment and storage facilities, and other related infrastructure is factored into the system’s reliability assessment. If the target reliability cannot be achieved, then additional water source capacity and storage at the destination or in the system should be provided.

The process of water system optimization can also be applied to increase system reliability. System optimization is the process of identifying, understanding, and cost effectively eliminating unnecessary losses and risks, while at the same time reducing energy consumption and improving the reliability of pumping systems. For example, energy savings of 20% to 40% are often achievable after implementing the recommendations made through a formal pumping system assessment and efficiency study.

The first step of an assessment includes screening individual units for possible improvements. This is conducted by individually operating each pumping unit to determine the pumping plant efficiency and the degree to which the unit is performing away from its specific best efficiency point (BEP) and the expected plant efficiency.
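
One common way to express pumping plant efficiency is the ratio of water horsepower delivered to the horsepower drawn from the electrical supply. The sketch below uses the standard U.S. customary relationships (water horsepower equals GPM times feet of head divided by 3960, and 1 horsepower equals about 0.746 kW). The test readings, the expected efficiency, and the BEP flow are hypothetical values chosen for illustration.

```python
GPM_FT_PER_HP = 3960.0  # GPM x feet of head per water horsepower
KW_PER_HP = 0.746       # kilowatts per horsepower


def water_horsepower(flow_gpm, total_head_ft):
    """Hydraulic power delivered to the water."""
    return flow_gpm * total_head_ft / GPM_FT_PER_HP


def plant_efficiency(flow_gpm, total_head_ft, input_kw):
    """Overall (wire-to-water) efficiency as a fraction of input power."""
    input_hp = input_kw / KW_PER_HP
    return water_horsepower(flow_gpm, total_head_ft) / input_hp


def percent_off_bep(flow_gpm, bep_flow_gpm):
    """Operating flow relative to BEP flow; negative means below BEP."""
    return 100.0 * (flow_gpm - bep_flow_gpm) / bep_flow_gpm


# Hypothetical field test of one pumping unit.
measured = plant_efficiency(flow_gpm=500, total_head_ft=250, input_kw=37.0)
expected = 0.68  # assumed expected plant efficiency for this unit
deviation = percent_off_bep(flow_gpm=500, bep_flow_gpm=600)

print(f"Measured plant efficiency: {measured:.1%} (expected {expected:.0%})")
print(f"Operating point relative to BEP flow: {deviation:+.1f}%")
```

A unit that tests well below its expected efficiency, or well away from its BEP flow, would be flagged for the closer assessment described next.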

Candidates for testing can include large or repeatedly problematic pumps such as those with high operating hours, cavitation, noise, excessive vibration, frequent on/off cycling, sustained operation away from their BEP, exposure to abrasives, and higher-than-desired energy consumption and maintenance costs.

Pumping systems with one or more of these symptoms are often ideal candidates for further assessment and possible pump and motor retrofit. Large or high-maintenance systems that are critical to water system production or facility operation are generally the highest priorities.

Pumps identified during the initial screening process go through a more thorough assessment or analysis to confirm opportunities to improve performance or the mean time between failures (MTBF). In addition, a lifecycle cost (LCC) analysis can generate the often-needed financial justification for any improvements by framing the proposed pumping system improvements as a strict economic argument, rather than a technical one.

An LCC analysis can fully assess the cost of purchasing, installing, operating, maintaining, and disposing of the system’s components to provide a thorough evaluation. When documenting anticipated cost savings, local utility or regulatory rebates or incentives should also be included.
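
A simplified LCC comparison can be sketched by converting recurring costs to present value and adding the one-time costs. The cost categories follow the paragraph above. The discount rate, the 20-year study period, the constant annual costs, and every dollar figure are assumptions made only for this example.

```python
def present_value(annual_cost, years, discount_rate):
    """Present value of a constant cost paid at the end of each year."""
    if discount_rate == 0:
        return annual_cost * years
    return annual_cost * (1 - (1 + discount_rate) ** -years) / discount_rate


def lifecycle_cost(purchase, installation, annual_energy, annual_maintenance,
                   disposal, years, discount_rate, rebate=0.0):
    """Total present-value cost of owning a pumping system."""
    recurring = present_value(annual_energy + annual_maintenance,
                              years, discount_rate)
    disposal_pv = disposal / (1 + discount_rate) ** years
    return purchase + installation + recurring + disposal_pv - rebate


# Hypothetical comparison over 20 years at a 4% discount rate.
existing = lifecycle_cost(purchase=0, installation=0,
                          annual_energy=30_000, annual_maintenance=6_000,
                          disposal=2_000, years=20, discount_rate=0.04)
retrofit = lifecycle_cost(purchase=45_000, installation=15_000,
                          annual_energy=21_000, annual_maintenance=3_000,
                          disposal=2_000, years=20, discount_rate=0.04,
                          rebate=5_000)

print(f"Keep existing unit: ${existing:,.0f}")
print(f"Retrofit pump and motor: ${retrofit:,.0f}")
print(f"Lifecycle savings from retrofit: ${existing - retrofit:,.0f}")
```

In this assumed case the retrofit costs more up front but less over its life, which is the kind of economic argument an LCC analysis is meant to support.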

Proper pump scheduling is another method of improving system reliability as well as system and unit efficiency. The purpose of pump scheduling is to plan the operation of pumps over a specific time period or at optimum delivery pressures to efficiently meet the current water consumer demands. This is often ignored due to the perceived complexity involved in facilitating pumping unit operating changes and schedules.

Optimizing this function has proven to be a practical and highly effective method of reducing operating costs without significantly altering the actual infrastructure or reprogramming the entire pumping or control system. Pump system scheduling should be oriented around the combined use of the most efficient units to meet current demands, while retaining enough flexibility to handle any conceivable operational scenario or demand change. It should also ensure the system can functionally handle all potential operating conditions, such as bringing variable-speed pumps, constant-speed pumps, and cycling units into an operating step, as well as examining potential anomalies.
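
The idea of meeting each demand with the most efficient combination of units can be sketched as a small search over pump combinations. The pump names, flows, and power draws are hypothetical. The sketch also assumes that flows from pumps running together simply add, which ignores the interaction with the system head curve and should be treated only as a first approximation.

```python
from itertools import combinations

# Hypothetical pumps: (name, flow in GPM, input power in kW).
PUMPS = [
    ("Well 1", 400, 30.0),
    ("Well 2", 600, 41.0),
    ("Well 3", 600, 48.0),
    ("Well 4", 900, 70.0),
]


def best_combination(pumps, demand_gpm):
    """Return the pump set that meets demand with the least input power."""
    best = None
    for count in range(1, len(pumps) + 1):
        for combo in combinations(pumps, count):
            flow = sum(pump[1] for pump in combo)
            power = sum(pump[2] for pump in combo)
            if flow >= demand_gpm and (best is None or power < best[1]):
                best = (combo, power)
    if best is None:
        raise ValueError("demand exceeds the combined capacity of all pumps")
    return [pump[0] for pump in best[0]], best[1]


for demand in (350, 900, 1500):
    names, kw = best_combination(PUMPS, demand)
    print(f"{demand} GPM demand: run {', '.join(names)} ({kw:.0f} kW)")
```

A schedule built this way for each demand step could then be entered as control setpoints, with the operator keeping the flexibility to override it when conditions require.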

Modern SCADA systems and water systems controlled by programmable logic controllers (PLCs) generally possess the flexibility to easily make these types of control setpoint revisions.

Improving Water System Reliability and Resilience

Some of the steps often used to help achieve and maintain sustainable pumping and improve water system reliability, which will increase system resilience, include:

  • Observe proper system design and equipment selection with the goal of achieving optimum reliability.
  • Use proper material selection, workmanship, installation, and commissioning of each facility.
  • Use proper flow control of pumping units (VFD, VSS, or inline control valves) when appropriate.
  • Use proper operation of pumping units at their respective BEP matched to the system demand.
  • Perform proper maintenance on a scheduled basis of operating hours or production volume.
  • Perform stocking of or confirm rapid access to replacement parts, particularly critical-need parts.
  • Continuously monitor and track overall water system efficiency.
  • Track and document lifecycle costs and history of each water system major component.
  • When water filtration is utilized, examine the treatment train for possible system improvements or media replacement or upgrade to improve process. Reduce differential pressure to trigger backwash. Use several smaller filters rather than a single large filter vessel to provide a means of bypassing defective or disabled units at reduced flow rates to increase redundancy.
  • Establish a systematic regular pump testing program to track vibration and to verify operation and efficiency.
  • Establish an optimized system configuration and management program for seasonal flow variations.
  • Upgrade standard efficiency motors to high efficiency motors if justified by use and economics.
  • If possible, pump during off-peak hours to reduce daytime electrical energy charges.
  • Match average-day demand capable booster pumps to high demand periods if adequate storage exists.
  • Regarding SCADA systems, use performance tracking of operating units to match system demands.
  • Use control systems that allow flexible programming changes to operational flow and pressure setpoints to permit changes to coincide with seasonal or daily demand variations.
  • Build in greater system reliability by using high reliability and redundant components.
  • For critical sites, use redundant power sources with a second utility feed or automatic-start backup generator.
  • Consider adding redundancy at critical-use facilities by the addition of backup pumps, filters, reservoirs, etc. System reliability is often increased by placing redundant equipment at differing sites.
  • With VTPs in sandy wells or with low static levels, consider using specialized lineshaft and bowl bearings.
  • On larger submersible installations, use low-speed (1800 RPM) motors and pumps to increase the service life.

These are just a few tips that can increase reliability, which can greatly lower the risk and increase the resilience of a water system to withstand threats.

There are many more things to do, some with a unique application to a single water system. Each water system must be evaluated for the specific risks associated with it as well as the opportunities to lessen those risks. As water system professionals, we have many ways to assist the water system in identifying and addressing these risks.

Until next month, work safe and smart.

Ed Butts, PE, CPI, is the chief engineer at 4B Engineering & Consulting, Salem, Oregon. He has more than 40 years of experience in the water well business, specializing in engineering and business management. He can be reached at epbpe@juno.com.