In the last few years, more and more public and private networks have come to rely on cloud environments and virtualization to provide service while meeting their SLA commitments. One attractive property of the cloud is its support for rapid elasticity: the ability to scale the number of virtual machines up and down according to the load, which can often be configured to occur automatically, according to customer-set thresholds. The main purpose of the auto-scaling mechanism is to cope with changes in the traffic load.
While it is mainly used to cope with predictable changes that arise from known characteristics of the service (e.g., peak hours), it is also recommended as a remedy for unpredictable loads that may arise from flash crowds or even malicious Distributed Denial of Service (DDoS) attacks. Amazon lists the auto-scaling mechanism as one of the best practices for dealing with DDoS attacks [4]. In a DDoS attack, an attacker overwhelms the victim with bogus traffic, blocking the service from legitimate users. With a cloud-based operation, the auto-scaling mechanism ensures that a victim can cope with an attack by providing additional resources for handling the extra traffic. This solution, however, comes with an economic penalty, termed an Economic Denial of Sustainability (EDoS) attack. The victim needs to pay the cloud infrastructure provider for the extra resources required to process the bogus traffic, resources which provide no real benefit to the victim.
Nevertheless, it is commonly assumed that auto-scaling is a good enough solution, since it ensures that the service will continue to run with good performance. Moreover, the economic damage is not a real deterrent for the victim, since the alternative remedies, such as DDoS scrubber middleboxes or DDoS scrubber cloud services [6, 24], also cost the victim money. In this paper, we show that, contrary to this common belief, a shrewd attacker can cause substantial performance damage, up to repeated episodes of total denial of service, on top of the economic damage. We analyze the Yo-Yo attack, in which the attacker sends periodic bursts of overload, thus causing the auto-scaling mechanism to oscillate between scale-up and scale-down. During each scale-up process, which usually takes up to a few minutes, the cloud service suffers a substantial performance penalty.
When the scale-up process is finished, the attacker stops sending traffic and waits for the scale-down process to start. When the scale-down ends, the attacker begins the attack again, and so on. Overall, the cloud service suffers a substantial performance penalty for almost half the duration of the attack. Moreover, when the scale-up process ends there are extra machines but no extra traffic.
Thus the victim pays for the machines in vain. Note that these short bursts are harder to detect and also reduce the cost of the attack to the attacker. We demonstrate the Yo-Yo attack on Amazon's cloud service under different configurations and analyze the damage. The attack requires inferring the state of the victim's auto-scaling mechanism (i.e., whether the system is in the middle of scale-up or scale-down). We show that it is feasible for the attacker to detect this state by sending probe packets that measure the response time. To the best of our knowledge, we are the first to analyze such attacks in depth, and our work helps to explain the recently reported behavior of attacks that come in repeated waves.
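The on/off cycle described above can be illustrated with a toy discrete-time simulation. This is only a minimal sketch under stated assumptions, not the experimental setup used in this paper: the `ToyAutoScaler` class, its thresholds, the one-machine-per-tick scaling step, and the load values are all hypothetical. As a further simplification, the simulated attacker reads the machine count directly, whereas a real attacker would have to infer the scaling state from probe response times.

```python
from dataclasses import dataclass


@dataclass
class ToyAutoScaler:
    """Hypothetical threshold-based scaler; every parameter is illustrative."""
    machines: int = 2
    min_machines: int = 2
    capacity_per_machine: float = 100.0  # requests/s served by one machine
    scale_up_util: float = 0.8           # add a machine above this utilization
    scale_down_util: float = 0.3         # remove a machine below this utilization

    def utilization(self, load: float) -> float:
        return load / (self.machines * self.capacity_per_machine)

    def step(self, load: float) -> None:
        """Add or remove at most one machine per tick."""
        u = self.utilization(load)
        if u > self.scale_up_util:
            self.machines += 1
        elif u < self.scale_down_util and self.machines > self.min_machines:
            self.machines -= 1


def simulate_yoyo(base_load: float = 100.0, burst_load: float = 900.0,
                  ticks: int = 60):
    """Return a list of (attacking, machines, utilization), one entry per tick."""
    scaler = ToyAutoScaler()
    attacking = True
    history = []
    for _ in range(ticks):
        load = base_load + (burst_load if attacking else 0.0)
        history.append((attacking, scaler.machines, scaler.utilization(load)))
        before = scaler.machines
        scaler.step(load)
        # Assumed attacker rule: switch phase once the scaler stops changing,
        # i.e. when the current scale-up or scale-down has finished.
        if scaler.machines == before:
            attacking = not attacking
    return history
```

Counting the ticks in `simulate_yoyo()` whose utilization exceeds 1.0 gives a rough measure of the performance damage in this toy setting, and comparing the machine count during the off phases with `min_machines` gives a rough measure of the machines paid for in vain.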
Auto-scaling mechanisms employ one of two common policy types. The first is the discrete scale policy, which adds one or a few machines at a time, checks whether the problem has been resolved, and if not continues to add machines iteratively. The second is the adaptive scale policy, which tries to estimate the number of machines required to cope with the traffic load and adds them all at once. We model the Yo-Yo attack under the two policies. In the discrete scale policy, we show that if the burst in the load is up to k more than the original load, the victim will pay for approximately k/2 more machines, and the average extra load on a machine will be logarithmic in k.
In the adaptive model, on the other hand, we show that the economic damage and the extra load are linear in k. In a representative use case this translates to a requirement of k/2 extra machines, and the average extra load will be k/2. Thus, while under the adaptive auto-scaling policy the system is guaranteed to adapt to the extra load in a shorter time, this policy is shown to be more vulnerable than the discrete policy. We then discuss various auto-scaling parameters and their influence on the damage from the Yo-Yo attack.
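To make the difference between the two policy types concrete, the following minimal sketch contrasts how each one reaches the machine count needed for a given load. It illustrates the policy definitions only and does not reproduce the analytical model of this paper; the function names, the `per_machine` capacity, and the `step` size are hypothetical.

```python
import math


def discrete_scale_up(machines: int, load: float,
                      per_machine: float = 1.0, step: int = 1) -> list:
    """Add `step` machines per round until total capacity covers the load.

    Returns the machine count after each round, starting with the initial one.
    """
    trace = [machines]
    while machines * per_machine < load:
        machines += step
        trace.append(machines)
    return trace


def adaptive_scale_up(machines: int, load: float,
                      per_machine: float = 1.0) -> list:
    """Estimate the required machine count and add the difference at once."""
    needed = math.ceil(load / per_machine)
    trace = [machines]
    if needed > machines:
        trace.append(needed)
    return trace
```

Under these assumptions, the length of the returned trace shows how many scaling rounds each policy needs: the discrete policy walks through every intermediate machine count, while the adaptive policy jumps to its estimate in a single round.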
We show that auto-scaling is not a bulletproof remedy against variations of DDoS attacks such as the Yo-Yo attack, and that DDoS mitigation is a crucial component in the cloud as well. The remainder of this paper is organized as follows. Section 2 describes cloud scaling characteristics and existing attacks in the cloud area. Section 3 presents the Yo-Yo attack in detail.
Section 4 introduces the related work. In Section 5 we model the attack, analyze it mathematically, and compare it to the DDoS attack. In Section 6 we evaluate the Yo-Yo attack and assess the impact of our attack on a real cloud service infrastructure. In Section 7 we discuss possible defense strategies. In Section 8 we present our conclusions.