The Sitecore robot detection component
Every time a page is requested on your website, the following pipeline processor is activated:
The processor first checks whether robot detection is enabled by reading the value of the Analytics.AutoDetectBots setting in the Sitecore.Analytics.Tracking.config file. You can disable the component by changing this setting to false.
The ContactClassification class contains the classification constants and helper methods. The following helper methods take the contact classification as a parameter and return a Boolean value indicating whether the contact is a human or a robot.
Initial visitor classification
The SC_ANALYTICS_GLOBAL_COOKIE cookie contains the IsClassificationGuessed field, which is set to true or false.
When a new visitor comes to the website, this field is set to false by default because the classification of the visitor has not yet been determined. At this stage, the visitor could be a human or a robot. When the visitor classification has been determined, this field is set to true.
The visitor identification control
The VisitorIdentification control is rendered on the page. It first checks whether the VisitorIdentification.ascx control is present in the layouts/system folder. If the control is present, the content of the VisitorIdentification.ascx user control is rendered on the page, and it:
- Saves the current UTC time to the
- Adds a reference to the
When a visitor views a page, the
There are two events that the script subscribes to:
OnMouseMove event – triggered when a computer mouse is moved.
OnTouchStart event – triggered when the screen on a tablet or mobile phone is touched.
If the visitor moves the mouse or touches the screen of a tablet or mobile phone, the script runs code that requests the VisitorIdentificationCSS.aspx page. The script does not request the page directly. Instead, it creates a URL to the page, which the browser then attempts to load as a style sheet. A robot is unlikely to load this style sheet, so a request for it indicates human behavior. The VisitorIdentificationCSS.aspx page generates an empty style sheet and contains code that is executed every time the page is requested.
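The client-side behavior described above can be sketched roughly as follows. This is a minimal illustration rather than the actual Sitecore script: the function name `watchForHumanInput`, the `doc` and `stylesheetUrl` parameters, and the use of a `<link>` element to make the browser load the URL are all assumptions.

```javascript
// Hypothetical sketch: on the first mouse move or touch, attach a style sheet
// link so that the browser requests the visitor identification CSS page.
function watchForHumanInput(doc, stylesheetUrl) {
  let requested = false;

  function requestStylesheet() {
    if (requested) return; // request the page only once
    requested = true;
    doc.removeEventListener("mousemove", requestStylesheet);
    doc.removeEventListener("touchstart", requestStylesheet);

    // Assumption: the URL is loaded by adding a <link> element, so the
    // browser (not the script) issues the request.
    const link = doc.createElement("link");
    link.rel = "stylesheet";
    link.href = stylesheetUrl;
    doc.head.appendChild(link);
  }

  doc.addEventListener("mousemove", requestStylesheet);
  doc.addEventListener("touchstart", requestStylesheet);
}
```

In a browser, this could be called as `watchForHumanInput(document, url)`, where `url` points to the VisitorIdentificationCSS.aspx page. Passing the document in as a parameter also lets the function be exercised with a stub outside the browser.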
If a human visitor has caused the page to run, the code in this page makes the following changes:
- Visitor classification code is set to 0, which means the visitor is classified as human.
Tracker.Current.Session.SetClassification(0, 0, true);
- The IsClassificationGuessed Boolean value of the cookie is set to true. This means that the visitor has now been classified, so the robot detection logic no longer needs to be executed.
cookie.IsClassificationGuessed = true;
- The ASP.NET session timeout setting is reset back to the default for human visitors (20 minutes).
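Taken together, the three changes listed above amount to a small state update. The sketch below models it with plain objects. The names `markVisitorAsHuman` and `needsRobotDetection` and the object shapes are hypothetical and do not correspond to Sitecore types. Only the values 0 and 20 minutes come from the text.

```javascript
// Values taken from the text above.
const HUMAN_CLASSIFICATION = 0;
const HUMAN_SESSION_TIMEOUT_MINUTES = 20;

// Hypothetical stand-in for the code that runs when the CSS page is requested.
// `session` and `cookie` are plain objects here, not Sitecore types.
function markVisitorAsHuman(session, cookie) {
  session.classification = HUMAN_CLASSIFICATION;
  cookie.isClassificationGuessed = true;
  session.timeoutMinutes = HUMAN_SESSION_TIMEOUT_MINUTES;
}

// Detection only needs to run while the classification is still undetermined.
function needsRobotDetection(cookie) {
  return !cookie.isClassificationGuessed;
}
```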
Timeout setting comparison
timeoutSleep (30000, placeCheckerRequest);
This function reads the UTC time from the
VICurrentDateTime meta tag and makes a request to the
VIChecker.aspx page sending the retrieved time in the
VIChecker.aspx page checks the difference between the current UTC time and the time in the
Tracker.Current.Session.SetClassification(925, 925, true);
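The client-side half of this check might look roughly like the sketch below. It is only an illustration. The helper names, the `checkerUrl` and `paramName` parameters, and the assumption that the time is stored in the `content` attribute of a meta tag named VICurrentDateTime are not taken from the Sitecore script. The server-side comparison is left out because the condition that triggers the robot classification is not spelled out here.

```javascript
const CHECK_DELAY_MS = 30000; // same delay as in the timeoutSleep call above

// Hypothetical helper: reads the rendered time from the meta tag and builds
// the checker URL. Returns null if the meta tag is missing.
function buildCheckerUrl(doc, checkerUrl, paramName) {
  // Assumption: <meta name="VICurrentDateTime" content="...">
  const meta = doc.querySelector('meta[name="VICurrentDateTime"]');
  if (!meta) return null;
  const renderedAt = meta.getAttribute("content");
  return (
    checkerUrl +
    "?" +
    encodeURIComponent(paramName) +
    "=" +
    encodeURIComponent(renderedAt)
  );
}

// Hypothetical helper: waits for the delay, then hands the URL to sendRequest.
// setTimer is injectable so the function can be exercised without waiting.
function scheduleCheckerRequest(doc, checkerUrl, paramName, sendRequest, setTimer = setTimeout) {
  return setTimer(() => {
    const url = buildCheckerUrl(doc, checkerUrl, paramName);
    if (url !== null) sendRequest(url);
  }, CHECK_DELAY_MS);
}
```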
The Media Request event handler
In earlier robot detection logic, if a visitor made a request to download a media item, then the visitor was identified as human. In the xDB robot detection component, this approach is not enough.
In the Sitecore.Analytics.Tracking.RobotDetection.config file, the following event handler enforces this:
When this event handler is loaded, it processes the tracking field of the media item but does not change the classification to human if a visitor downloads a media item.
To change the classification, you need access to the session. In Sitecore, the custom media request session module (a C# class file) enables a session for requests to media items that have a value in the tracking field. If the tracking field is empty, a session is not required, which speeds up the processing of those requests.
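The decision the session module makes can be reduced to a single check. The sketch below is a simplified model, not the module's C# code: the function name, the `trackingField` property, and the choice to treat a whitespace-only value as empty are assumptions.

```javascript
// Hypothetical sketch: a media request needs a session only when the media
// item has something in its tracking field.
function mediaRequestNeedsSession(mediaItem) {
  const tracking = mediaItem.trackingField;
  // Assumption: a missing or whitespace-only value counts as empty.
  return typeof tracking === "string" && tracking.trim().length > 0;
}
```

Skipping the session for untracked media keeps the common case cheap, because only items that can affect the classification need session access.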