                Author: Michael LaPan | Date: April 5, 2022 | Category: Programming | Tags: programming, wordpress, nginx | Reading time: about 3 min


                # Stopping specific bots from using your monthly ShipperHQ quota

                Also prevent specific bots from accessing your website (nginx)

                # The Problem

                A co-worker recently brought an issue to me: one of our clients was exceeding their monthly ShipperHQ API quota. The WordPress site in question had gone over ShipperHQ's 10k API call limit two months in a row, and the site's combined traffic and sales did not justify 10k API calls a month. That is where I came in, to see whether there was a deeper issue.

                My first step was to identify when the site sent the API call to ShipperHQ. After navigating through a few pages while keeping an eye on the browser's network tab, I determined that the call was sent when visiting the checkout.

                With that known, it was time to see where the hits were coming from.

                1. My first thought was to check whether the checkout page was indexed on Google. A quick Google dork (`site:site-url inurl:checkout`) showed a few entries for the checkout URL. We had them removed and edited the robots.txt.

                2. Next, I made sure ShipperHQ was not being called on the cart page.

                3. Lastly, after analyzing the logs, I found a large number of bots that did not respect the robots.txt. Because of this, they were adding items to carts and navigating to the checkout as they crawled the site. This is where I believe the majority of the ShipperHQ API hits were coming from. Now it was time to block these bots from the site entirely.
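The log analysis in step 3 can be sketched in a few lines of Python. This is only a rough illustration, not the exact process used on the client site: it assumes the server writes nginx's default "combined" log format and that checkout URLs contain `/checkout`. The function names and the default path fragment are made up for the example.

```python
import re
from collections import Counter

# Assumption: nginx's default "combined" log_format. Adjust if yours differs.
LOG_LINE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+)[^"]*" '
    r'(?P<status>\d{3}) \S+ "(?P<referer>[^"]*)" "(?P<agent>[^"]*)"'
)


def count_checkout_agents(lines, path_fragment="/checkout"):
    """Count requests per user agent for paths containing path_fragment."""
    counts = Counter()
    for line in lines:
        match = LOG_LINE.match(line)
        if match and path_fragment in match.group("path"):
            counts[match.group("agent")] += 1
    return counts


def report(log_path, top=20):
    """Print the user agents that hit the checkout most often."""
    with open(log_path, encoding="utf-8", errors="replace") as handle:
        for agent, hits in count_checkout_agents(handle).most_common(top):
            print(f"{hits:6d}  {agent}")
```

Calling `report("/var/log/nginx/access.log")` (a typical but not universal log location) prints the busiest user agents first, which makes crawlers that walk into the checkout easy to spot.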

                # Nginx Config

                To block these malicious bots from crawling the site, you can add the following to your nginx conf. It isn't perfect, since whoever runs the crawler can change the user agent, but it will at least catch the majority of them.

                If you need to find a user agent, look in your nginx access.log; it is recorded there. Copy and paste it into this list, escaping any spaces with a backslash as the existing entries do, then reload or restart nginx and it will start being blocked.
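When adding entries copied from the log, spaces and regex metacharacters need escaping. Here is a small helper sketch that leans on Python's `re.escape`. It assumes the escaped output is also valid in the PCRE syntax nginx uses, so review the result by hand before pasting it in. The function name is hypothetical.

```python
import re


def build_pattern(fragments):
    """Join escaped user-agent fragments into one (a|b|c) alternation.

    Duplicates that differ only by case are dropped, because the nginx
    `~*` operator already matches case-insensitively.
    """
    seen = set()
    parts = []
    for fragment in fragments:
        key = fragment.lower()
        if key in seen:
            continue
        seen.add(key)
        # re.escape backslash-escapes spaces, dots, hyphens and similar.
        parts.append(re.escape(fragment))
    return "(" + "|".join(parts) + ")"
```

For example, `build_pattern(["Offline Navigator", "MegaIndex", "megaindex"])` gives `(Offline\ Navigator|MegaIndex)`. If a fragment contains `;`, `{` or `}`, the nginx documentation says to enclose the whole regular expression in quotes, which this sketch does not handle.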

                For any matching user agent, nginx sends back code 444. This is a non-standard, nginx-specific code meaning CONNECTION CLOSED WITHOUT RESPONSE: nginx closes the connection without sending anything back to the client.

                  # Close the connection (444) when the User-Agent matches any entry below (case-insensitive).
                  if ($http_user_agent ~* (360Spider|80legs.com|Abonti|AcoonBot|Acunetix|adbeat_bot|AddThis.com|adidxbot|ADmantX|AhrefsBot|AngloINFO|Antelope|BaiduSpider|BeetleBot|billigerbot|binlar|bitlybot|BlackWidow|BLP_bbot|BoardReader|Bolt\ 0|BOT\ for\ JCE|Bot\ mailto\:craftbot@yahoo\.com|casper|CazoodleBot|CCBot|checkprivacy|ChinaClaw|chromeframe|Clerkbot|Cliqzbot|clshttp|CommonCrawler|comodo|CPython|crawler4j|Crawlera|CRAZYWEBCRAWLER|Curious|Custo|CWS_proxy|Default\ Browser\ 0|diavol|DigExt|Digincore|DIIbot|discobot|DISCo|DoCoMo|DotBot|Download\ Demon|DTS.Agent|EasouSpider|eCatch|ecxi|EirGrabber|Elmer|EmailCollector|EmailSiphon|EmailWolf|Exabot|ExaleadCloudView|ExpertSearchSpider|ExpertSearch|Express\ WebPictures|ExtractorPro|extract|EyeNetIE|Ezooms|F2S|FastSeek|feedfinder|FeedlyBot|FHscan|finbot|Flamingo_SearchEngine|FlappyBot|FlashGet|flicky|Flipboard|g00g1e|Genieo|genieo|GetRight|GetWeb\!|GigablastOpenSource|GozaikBot|Go\!Zilla|Go\-Ahead\-Got\-It|GrabNet|grab|Grafula|GrapeshotCrawler|GTB5|GT\:\:WWW|harvest|heritrix|HMView|HomePageBot|HTTP\:\:Lite|HTTrack|HubSpot|ia_archiver|icarus6|IDBot|id\-search|IlseBot|Image\ Stripper|Image\ Sucker|Indigonet|Indy\ Library|integromedb|InterGET|InternetSeer\.com|Internet\ Ninja|IRLbot|ISC\ Systems\ iRc\ Search\ 2\.1|jakarta|Java|JetCar|JobdiggerSpider|JOC\ Web\ Spider|Jooblebot|kanagawa|KINGSpider|kmccrew|larbin|LeechFTP|libwww|Lingewoud|LinkChecker|linkdexbot|LinksCrawler|LinksManager\.com_bot|linkwalker|LinqiaRSSBot|LivelapBot|ltx71|LubbersBot|lwp\-trivial|Mail.RU_Bot|masscan|Mass\ Downloader|maverick|Maxthon$|Mediatoolkitbot|MegaIndex|MegaIndex|megaindex|MFC_Tear_Sample|Microsoft\ URL\ Control|microsoft\.url|MIDown\ tool|miner|Missigua\ Locator|Mister\ PiX|mj12bot|Mozilla.*Indy|Mozilla.*NEWT|MSFrontPage|msnbot|Navroad|NearSite|NetAnts|netEstate|NetSpider|NetZIP|Net\ Vampire|NextGenSearchBot|nutch|Octopus|Offline\ Explorer|Offline\ Navigator|OpenindexSpider|OpenWebSpider|OrangeBot|Owlin|PageGrabber|PagesInventory|panopta|panscient\.com|Papa\ Foto|pavuk|pcBrowser|PECL\:\:HTTP|PeoplePal|Photon|PHPCrawl|planetwork|PleaseCrawl|PNAMAIN.EXE|PodcastPartyBot|prijsbest|proximic|psbot|purebot|pycurl|QuerySeekerSpider|R6_CommentReader|R6_FeedFetcher|RealDownload|ReGet|Riddler|Rippers\ 0|rogerbot|RSSingBot|rv\:1.9.1|RyzeCrawler|SafeSearch|SBIder|Scrapy|Scrapy|SeaMonkey$|search.goo.ne.jp|SearchmetricsBot|search_robot|SemrushBot|Semrush|SentiBot|SEOkicks|SeznamBot|ShowyouBot|SightupBot|SISTRIX|sitecheck\.internetseer\.com|siteexplorer.info|SiteSnagger|skygrid|Slackbot|Slurp|SmartDownload|Snoopy|Sogou|Sosospider|spaumbot|Steeler|sucker|SuperBot|Superfeedr|SuperHTTP|SurdotlyBot|Surfbot|tAkeOut|Teleport\ Pro|TinEye-bot|TinEye|Toata\ dragostea\ mea\ pentru\ diavola|Toplistbot|trendictionbot|TurnitinBot|turnit|URI\:\:Fetch|urllib|Vagabondo|Vagabondo|vikspider|VoidEYE|VoilaBot|WBSearchBot|webalta|WebAuto|WebBandit|WebCollage|WebCopier|WebFetch|WebGo\ IS|WebLeacher|WebReaper|WebSauger|Website\ eXtractor|Website\ Quester|WebStripper|WebWhacker|WebZIP|Web\ Image\ Collector|Web\ Sucker|Wells\ Search\ II|WEP\ Search|WeSEE|Wget|Widow|WinInet|woobot|woopingbot|worldwebheritage.org|Wotbox|WPScan|WWWOFFLE|WWW\-Mechanize|Xaldon\ WebSpider|XoviBot|yacybot|YandexBot|Yandex|YisouSpider|zermelo|Zeus|zh-CN|ZmEu|ZumBot|ZyBorg) ) {
                    return 444;
                  }
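Before deploying a list this long, it can help to check which user agents it would actually catch. The sketch below mimics the `~*` operator (a case-insensitive, unanchored regex search) with Python's `re` module. It uses only a short excerpt of the blocklist above and assumes Python's regex behaviour is close enough to nginx's PCRE for entries this simple. The names `SAMPLE_PATTERN` and `would_block` are made up for the example.

```python
import re

# A short excerpt of the blocklist, kept small for illustration.
SAMPLE_PATTERN = r"(AhrefsBot|SemrushBot|mj12bot|Java|Wget|Offline\ Navigator)"


def would_block(user_agent, pattern=SAMPLE_PATTERN):
    """Return True if the pattern matches anywhere in the user agent,
    ignoring case, which is how nginx's `~*` operator behaves."""
    return re.search(pattern, user_agent, re.IGNORECASE) is not None
```

Because the match is unanchored, short entries such as `Java` or `grab` match anywhere in the string. It is worth running a handful of real browser user agents from your own access.log through a check like this to make sure legitimate customers are not caught.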
                

                # WordPress modification

                If you are unable to modify your nginx conf, another option is to move the shipping calculation to a different page.

                Alternatively, use jQuery to load the ShipperHQ calculation only after the shipping address has been filled out.

                Last update: 4/5/2022, 11:56:16 PM
                Contributors: Michael LaPan
                Copyright © 2022 Michael LaPan
