Botnet Mitigation with ipset
The Internet is a hostile place. Public servers are under constant attack. The GNU & FSF servers are no exception. A large part of the task of keeping our servers online is to deal with these ongoing attacks.
DDoS (Distributed Denial of Service) attacks are now so strong that no single server can stand under a large attack. These attacks take down even well-equipped large commercial providers. If we come under such an attack then we will fall the same as others have fallen.
But much of the abuse from the network is not intended to take our servers offline, yet does so through carelessness. Currently the biggest problem routinely hitting our servers is "AI" scrapers. Almost all of our data is already public, but these poorly designed web scrapers will scrape every URL on every page, including the browsable version control pages. Whereas it would be most efficient to git clone the entire repository, they scrape every version of every project in the most inefficient way. This browns out our servers.
We also get abuse from scrapers that hit completely broken URLs. When these map to static pages they are handled fairly well with 404 returns from the web servers. But when they hit dynamically generated pages, such as the FastCGI cgit pages which are rather heavyweight processes, they hit the server's system resources hard. These botnets can brown out and take an entire server down by hammering on the cgit interface.
When this happens and a large botnet is hammering on the cgit web UI, the dynamic FastCGI cgit.cgi interface for browsing git repositories, it pegs the load average of the system at the maximum number of those processes that we configure. Since we configure this to be the maximum tolerable on the system, the resource drain becomes large. All CPUs run at 100% and legitimate clients are starved out of the system.
In one case the URL pattern the botnet was hitting was easily identifiable. The pattern was a mangled, impossible one with multiple project.git strings one after the other in the URL. This easily identifiable pattern was used in two ways to mitigate the attack.
Nginx Configuration
The URL pattern could be identified within the Nginx configuration file and used to immediately return a 400 HTTP code (Bad Request) without invoking the much heavier cgit.cgi process.
location /cgit/ {
    location ~ ^/cgit/.*\.git/.*\.git/.*\.git/ {
        return 400;
    }
    # ... serve cgit.cgi ...
}
iptables and ipset
Traditionally we would use fail2ban rules to recognize and then block abusing IP addresses. This could still be useful but initially we had two problems. One is that fail2ban only works with IPv4 addresses at this time and this attack included both IPv4 and IPv6. Another is that this was a huge botnet, which we knew to be over a million remote IP addresses strong from across the globe, and that many addresses would not work well with plain iptables rules. Additionally, writing fail2ban rules is tedious.
We were aware of ipset but had not previously had experience using it. Jing provided initial documentation, motivation, and energy toward using ipset and this worked well. Thanks Jing! The result was a simple script to extract the IP addresses that were hitting the URL pattern and put them into an ipset to be blocked.
I wrote a perl script to do this blocking. The main part of the pattern was the same as the one above in the Nginx configuration. This pattern was used to build the ipset block list.
m{/cgit/.*\.git/.*\.git/.*\.git/}
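The script itself is not reproduced here, but its core is small. The following is a minimal sketch of the idea, assuming the nginx combined log format with the client address in the first field and an assumed /var/log/nginx/access.log location; the real script and its details differ.
#!/usr/bin/perl
# Minimal sketch: scan an nginx access log for the abusive URL signature
# and add the offending client addresses to the ipset block lists.
use strict;
use warnings;

my $log = shift(@ARGV) || '/var/log/nginx/access.log';   # assumed log location
my %seen;    # remember addresses already handled to avoid repeated ipset calls

open my $fh, '<', $log or die "cannot open $log: $!";
while (my $line = <$fh>) {
    # The same signature used in the nginx location rule above.
    next unless $line =~ m{/cgit/.*\.git/.*\.git/.*\.git/};

    # In the combined log format the first field is the remote address.
    my ($ip) = split ' ', $line;
    next unless defined $ip;
    next if $seen{$ip}++;

    # IPv6 addresses contain a colon, IPv4 addresses do not.
    my $set = ($ip =~ /:/) ? 'cgit-blv6' : 'cgit-bl';

    # -exist makes re-adding an already blocked address a no-op, not an error.
    system('ipset', '-exist', 'add', $set, $ip);
}
close $fh;
Run against an access log this adds matching IPv4 addresses to cgit-bl and matching IPv6 addresses to cgit-blv6, the sets created below.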
Before the ipsets can be used they must be created. I knew the IPv4 ipset would be a big list and started with these initial sizes.
ipset create cgit-bl hash:ip family inet hashsize 1048576 maxelem 2500000
ipset create cgit-blv6 hash:ip family inet6
After the ipsets are created they are hooked into iptables using the following.
iptables -w -I INPUT -m set --match-set cgit-bl src -p tcp -m multiport --dports 80,443 -j DROP
ip6tables -w -I INPUT -m set --match-set cgit-blv6 src -p tcp -m multiport --dports 80,443 -j DROP
The size of the ipset turned out not to be large enough. After running for a full day and night we hit the 2500000 maxelem limit! This required destroying the ipset and creating it again with a larger size. This was done by saving the previous ipset, increasing the size in the saved file, and restoring it. To destroy an ipset it must first be removed from use in iptables. Saving and restoring is very fast and takes only a few seconds.
ipset save > /var/tmp/ipset-cgit-bl-save
iptables -w -D INPUT -m set --match-set cgit-bl src -p tcp -m multiport --dports 80,443 -j DROP
ipset destroy cgit-bl
ip6tables -w -D INPUT -m set --match-set cgit-blv6 src -p tcp -m multiport --dports 80,443 -j DROP
ipset destroy cgit-blv6
sed 's/maxelem 2500000/maxelem 3000000/' /var/tmp/ipset-cgit-bl-save > /var/tmp/ipset-cgit-bl-save2
ipset restore < /var/tmp/ipset-cgit-bl-save2
iptables -w -I INPUT -m set --match-set cgit-bl src -p tcp -m multiport --dports 80,443 -j DROP
ip6tables -w -I INPUT -m set --match-set cgit-blv6 src -p tcp -m multiport --dports 80,443 -j DROP
The modified, larger ipset is now equivalent to one created with these larger values.
ipset create cgit-bl hash:ip family inet hashsize 2097152 maxelem 3000000
The resulting ipset sizes after most of the botnet IPs have been accrued are the following.
root@vcs3:/var/log# ipset list cgit-bl | head
Name: cgit-bl
Type: hash:ip
Revision: 5
Header: family inet hashsize 2097152 maxelem 3000000 bucketsize 12 initval 0x73846ad0
Size in memory: 72030328
References: 1
Number of entries: 2635555
Members:
38.61.137.112
165.16.161.199
root@vcs3:/var/log# ipset list cgit-blv6 | head
Name: cgit-blv6
Type: hash:ip
Revision: 5
Header: family inet6 hashsize 1024 maxelem 65536 bucketsize 12 initval 0x385c59aa
Size in memory: 113104
References: 1
Number of entries: 3017
Members:
2400:cb00:413:1000:c28a:c512:99d1:be1c
2400:cb00:573:1000:e954:b34f:c108:519a
Currently it feels like we are in the tail of the botnet, where all of the IPs of the fast machines have already been collected. But there is always a long thin tail of botnet systems that trickle in for a long time. At this writing the botnet is still showing us new members, which are immediately added to the ipset block list. It's astounding that this botnet is so large. Though it is possible that some of these systems are on dynamic addresses and are simply rotating to new ones.
The result at this time is that the system is able to keep operating acceptably in spite of abuse from this very large botnet of some 2.63 million addresses. That's mainly because this feels like a misconfigured "AI" scraper and not an actual DDoS attack. No single system would survive a DDoS from such a large botnet. But since this seems to be an incorrectly configured scraper botnet it could be mostly mitigated.
Collateral Damage
It turns out that many places, and Brazil in particular, have many ISPs that use CG-NAT, or Carrier Grade NAT, to Network Address Translate many subscribers onto one shared IPv4 address. This mixes together potentially many thousands of people! If there were a list of these NAT addresses we could avoid hard blocking them. After initially implementing this ban we started to get reports from people who were blocked. We quickly unblocked the reported addresses.
As a hack I implemented an allow list of sorts. The same script that looked for bad signatures also looked for good signatures of successful git transactions and, upon seeing these, added those addresses to an allow list. Later, when adding addresses to the block list, I would first check whether the address was on the allow list and if so then avoid adding it to the block list. This is not a perfect solution because it depends upon the order of actions. But it does help and is easy enough, so it is done.
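To make the ordering concrete, here is a minimal sketch of that allow-before-block logic. The "good" signature shown, a successful git smart-HTTP request, and the log handling are illustrative assumptions rather than the actual production values.
#!/usr/bin/perl
# Sketch of the allow-before-block ordering described above.
use strict;
use warnings;

my %allow;    # addresses that have completed a successful git transaction

while (my $line = <>) {
    my ($ip) = split ' ', $line;
    next unless defined $ip;

    # Good signature: an assumed form of a successful git smart-HTTP request.
    if ($line =~ m{"(?:GET|POST) \S+/(?:info/refs|git-upload-pack)} and $line =~ m{" 200 }) {
        $allow{$ip} = 1;
        next;
    }

    # Bad signature: the mangled multi-.git URL pattern, skipped if allowed.
    if ($line =~ m{/cgit/.*\.git/.*\.git/.*\.git/} and not $allow{$ip}) {
        my $set = ($ip =~ /:/) ? 'cgit-blv6' : 'cgit-bl';
        system('ipset', '-exist', 'add', $set, $ip);
    }
}
Because the check only protects an address whose good request is seen before its bad one, the order-of-actions weakness mentioned above remains.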
Timeout Expirations
Hard blocking 3 million addresses avoided the initial problem but is not workable over time. It will become a problem after a while as the use of addresses moves around. This needs to be dynamic and automatic or it won't work for the long term.
ipset includes a timeout feature. It must be declared when the ipset is created, and then a timeout is applied to every entry. The entry expires after the timeout. This is perfect!
ipset create cgit-bl hash:ip family inet hashsize 2097152 maxelem 3000000 timeout 3600
ipset add cgit-bl 192.0.2.11 timeout 86400
The above creates an ipset with a default 3600 second (1-hour) expiration timeout. Then it adds an example IP with an 86400 second (1-day) expiration timeout. The timeout ticks down until it reaches zero, at which time the entry is removed from the ipset. This feature makes it very easy to write custom fail2ban-like scripts with automatically expiring IP addresses.
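One related detail from the ipset documentation: re-adding an entry with the -exist option updates its timeout instead of reporting an error, so an address that keeps abusing can have its block refreshed while idle addresses quietly age out. The address here is only a placeholder.
ipset -exist add cgit-bl 192.0.2.11 timeout 86400
The remaining time for each member is visible in the ipset list output.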
Allow Listing
It's possible to use an ipset allow list directly in the iptables rules. Currently I am not using this feature, but it is possible and useful to mention. At the moment I am using an ipset only to hold the allow list of any address that has made a successful git transaction, and if an IP is found in the allow list then it is not added to the block list. And of course once an IP is in the block list no more interaction will happen, so there is no need for a dynamic allow list. But this is possible.
iptables -w -I INPUT -m set ! --match-set ipset-allow-list src -p tcp -g ourchain
If the address is not in the ipset then the rule sends processing to the specified ourchain, which would then perform further processing such as blocking. Negating the match set is a useful feature.
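For completeness, here is a sketch of the pieces such a setup would need. The ipset-allow-list and ourchain names follow the example above and the address is only a placeholder.
ipset create ipset-allow-list hash:ip family inet
ipset add ipset-allow-list 192.0.2.50
iptables -w -N ourchain
iptables -w -A ourchain -m set --match-set cgit-bl src -j DROP
With this arrangement allow-listed addresses bypass ourchain entirely while everything else is still checked against the block list.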
Conclusion
Using ipset worked extremely well! This particular botnet abuse has been continuing for more than three weeks now and is still ongoing. There seems to be no end to it. Using ipset to block this huge number of addresses worked more efficiently than using plain iptables rules, which I am told become inefficient at some number of entries, perhaps around 100k. Using a custom script to quickly block addresses based upon the abuse signature was much less tedious than writing fail2ban rules. I have written quite a few fail2ban rules and I find writing them tedious.
With the large size of this botnet, though, blocking by address becomes less effective. With so many botnet addresses it is possible for each one to hit so slowly that blocking by address is simply not effective.
By far the most effective tactic was adding an nginx rule to quickly return 400 and avoid passing the request through the much heavier CGI pathway of cgit.cgi, which really consumed the system. In the future having a web application firewall in front may become a requirement.