
Halcyon

macrumors 6502
Original poster
Sep 21, 2006
Problem here.
I have a web page that is constantly exceeding its allotted monthly bandwidth... unfortunately not because of welcome visitors but because of spammers and probably bots.

I have checked the statistics for the page in question and cannot really pinpoint one single IP responsible for this. I was thinking that I could block the offending IP, but it changes constantly. The only trend I've been able to discern is that the offender will navigate to every single page in the site and will "inspect" every object that can be clicked or viewed (links, pictures, etc.). Since no real person does this, it has to be some kind of robot doing it... with what purpose? I don't know. It even "visits" linked CSS style sheets and PHP include files.

Any ideas or suggestions as to how to deal with this?

TIA
 
Nope, no robots.txt file yet. What I have done instead is place a meta tag that should work equally well, even though not all robots implement it yet:

<META NAME="ROBOTS" CONTENT="NOINDEX, NOFOLLOW">

I'll set up a robots.txt file, implement it and see if it helps.

Thanks.
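
For anyone setting up the same thing: a minimal sketch of how a robots.txt file can be sanity-checked before uploading it, using Python's standard `urllib.robotparser`. The file contents, the bot name `ExampleBot` and the `www.example.com` URLs are made up for illustration; they are not from the site discussed here. Note that this only tells you what a well-behaved crawler would do, since nothing forces a bot to read the file.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt body: ask every crawler to stay out of the whole site,
# which roughly mirrors the site-wide NOINDEX, NOFOLLOW meta tag.
ROBOTS_TXT = """\
User-agent: *
Disallow: /
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if a crawler that obeys robots_txt may fetch url."""
    parser = RobotFileParser()
    # parse() takes the file as a list of lines, so no network access is needed.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)
```

Calling `is_allowed(ROBOTS_TXT, "ExampleBot", "http://www.example.com/index.php")` shows whether a compliant crawler would be turned away from that page. To keep only some directories off limits, replace `Disallow: /` with one `Disallow:` line per path.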
 
What kind of site are you running? I don't think any indexing bots can use up that much bandwidth; it could be spammers though. If it really is spammers, I don't think they will honor your robots.txt file. How do you gather that spammers' visits are disabling your site?
 
I simply don't know. Just like you mention, I don't think indexing robots will use all that bandwidth (at least I've never seen or heard of that happening), so I'm assuming maybe spammers are the offenders. Then again, this particular web site has no content whatsoever that could be of interest to spammers (it's a small business/factory web site and the content is text and some images).

I've been monitoring traffic for the past 24 hours and it seems to have returned to "normal" levels after reaching a peak of 10 to 20 times its normal high during the past weekend. I'll keep monitoring and hope for the best.

Thanks for your help.

PS - Other people I've consulted agree with you that robots.txt files will not stop spammers (at least not the ones that are good at it)... and that it was most probably some sort of spamming that generated this. Once the spammer found nothing of interest (info, reaction, whatever...) it just moved on to another poor soul like myself :)
 
is there a stats package bundled with your hosting? have you looked through it to see what files are being requested?
 
Yes. Webalizer is the name of the package and I also run StatCounter which I find far superior for other info.

I've checked the logs and like I mentioned before they go after everything...all files, links, pictures, PHP include files, css style sheets, etc.
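
Since the IP keeps changing, one thing that might help narrow it down is grouping the raw access log by user agent instead of by address. Below is a rough sketch, assuming the host writes Apache's "combined" log format (that is an assumption; Webalizer and StatCounter reports will not have this shape). The function names and the threshold of 50 distinct paths are arbitrary choices for illustration.

```python
import re
from collections import defaultdict

# One line of Apache "combined" log format (assumed):
# IP ident user [time] "METHOD path protocol" status size "referer" "user agent"
LOG_LINE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[[^\]]+\] '
    r'"(?P<method>\S+) (?P<path>\S+)[^"]*" '
    r'(?P<status>\d{3}) (?P<size>\d+|-) '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"'
)


def summarize(lines):
    """Aggregate requests, bytes, distinct paths and distinct IPs per user agent."""
    stats = defaultdict(
        lambda: {"requests": 0, "bytes": 0, "paths": set(), "ips": set()}
    )
    for line in lines:
        match = LOG_LINE.match(line)
        if match is None:
            continue  # skip lines that are not in the assumed format
        entry = stats[match.group("agent")]
        entry["requests"] += 1
        size = match.group("size")
        entry["bytes"] += 0 if size == "-" else int(size)  # "-" means no body sent
        entry["paths"].add(match.group("path"))
        entry["ips"].add(match.group("ip"))
    return dict(stats)


def suspects(stats, min_distinct_paths=50):
    """User agents that touched many distinct paths, heaviest bandwidth first."""
    flagged = [
        agent for agent, s in stats.items()
        if len(s["paths"]) >= min_distinct_paths
    ]
    return sorted(flagged, key=lambda agent: stats[agent]["bytes"], reverse=True)
```

To try it, open the access log, pass the file object to `summarize`, and print what `suspects` returns. A client that requests every page, image, style sheet and include file should stand out by its distinct-path count even if it rotates addresses, although a bot that also rotates its user agent string would slip past this.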
 
do you have that little bandwidth available, or are they grabbing a whole bunch?

i've got a bunch of stuff in my .htaccess that cuts down on fraudulent signups on a joomla site. if you don't mind blocking the entire .ru population, and stuff like that.

for example:
Code:
deny from .vn
deny from .gr 
deny from .ru
deny from .kr
deny from .ar 
deny from .bz
deny from .sg

deny from megagiga.com
deny from netzero.net 
deny from spaceproxy.com
deny from anonymizer.com
deny from drikka.net
deny from netdirect.net
deny from juno.com
deny from voyager.net
deny from globali.net
deny from the-cloak.com
deny from anonymouse.ws
deny from megaproxy.com
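
A note on what rules like these actually match: as far as I understand the Apache documentation, `deny from .ru` is compared against the client's reverse-DNS hostname, whole components only, rather than against its physical location. The Python below is only an illustration of that matching rule (the function name and the hostnames are made up); it is not how Apache implements it.

```python
def host_matches(hostname: str, pattern: str) -> bool:
    """True if hostname equals pattern or ends with it on a whole-label boundary."""
    host_labels = hostname.lower().rstrip(".").split(".")
    pattern_labels = pattern.lower().strip(".").split(".")
    # Compare the trailing labels of the hostname with the pattern's labels.
    # If the pattern is longer than the hostname, the slice is the whole
    # hostname and the comparison is simply False.
    return host_labels[-len(pattern_labels):] == pattern_labels
```

So `host_matches("mail.example.ru", ".ru")` is a match, while `host_matches("example.guru", ".ru")` is not, because `guru` is a different label. A client whose address has no reverse DNS entry, or one that resolves to a `.com` name, would not be caught by a country rule no matter where it is located.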
 
Where did you get this list? Is it based on your experience? I'm kind of surprised to see .sg being blocked from your site.
 
i gathered bits and pieces from the joomla security forums. i didn't generate the list based on my experience, but my experience in using it (and some other rules) was a dramatic reduction (90-95%) of fraudulent signups.

that site is for a small, chicago-based theater company, and i'm willing to pay the price of losing our singapore audience to keep my webmastering sanity.

fwiw, i did add the countries one or two at a time. surprisingly, denying .ru didn't cut down that much trouble, but adding .vn did.
 
Is there any way to convince you to lose the .sg rule? :p:). (you can ignore this btw)

Anyway, to the OP, good luck keeping the traffic under control, and let us know if the problems come back.
 
i think it's worth a shot: why should i restore .sg?
Just a suggestion. I come from Singapore myself, and the laws here are pretty strict; you have probably heard of the ridiculous no-chewing-gum rule and the caning punishment. Recently, a teenager was prosecuted for leeching Wi-Fi.... If you manage to get the IP and ISP of a spammer originating from .sg, your chances of getting some response from the agency here are pretty high if it is really serious. Oh yeah, they also just passed a new anti-spam law this year.

Anyway, I will still respect your decision even if you still keep that rule. :)
 