The Sitecore Experience Database (xDB) uses two different types of databases:
- A MongoDB (NoSQL) collection database that collects all captured experience information in a loosely structured way
- An SQL Server reporting database that stores information extracted from the collection database in a form suitable for reporting
By default, Sitecore xDB keeps the reporting database in sync with the collection database.
Types of processing
Sitecore xDB has a data analysis layer that can perform several different kinds of processing on data stored in the collection database:
- Aggregation – aggregation processing extracts data from the collection database, then groups and reduces it before storing the data in the reporting database for use by Sitecore reporting applications.
- Continuous update – this process always keeps the reporting database in sync with the collection database.
- Rebuilding the reporting database – this process rebuilds the entire reporting database upon request.
- Maintenance – performs routine maintenance tasks on the collection database.
In daily operation the xDB uses a single reporting database that is continuously synchronized with new traffic information from the collection database.
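To make the "group and reduce" idea behind aggregation concrete, here is a minimal Python sketch. It is not Sitecore code: the `Interaction` shape, the `aggregate` function, and the choice of grouping key (day and channel) are assumptions made for illustration only.

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Interaction:
    # Hypothetical, simplified shape of a collected interaction.
    channel: str
    day: date
    value: int


def aggregate(interactions):
    """Group interactions by (day, channel) and reduce each group to totals."""
    totals = defaultdict(lambda: {"visits": 0, "value": 0})
    for item in interactions:
        row = totals[(item.day, item.channel)]
        row["visits"] += 1
        row["value"] += item.value
    return dict(totals)


raw = [
    Interaction("search", date(2017, 5, 1), 10),
    Interaction("search", date(2017, 5, 1), 5),
    Interaction("email", date(2017, 5, 1), 20),
]
report_rows = aggregate(raw)
```

Each key in `report_rows` corresponds to one pre-computed row that a reporting query could read directly, instead of scanning every raw interaction.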
Continuous update of the reporting database
Continuous update of the reporting database is the processing of interactions that have just ended. Unlike a rebuild of the reporting database, it is a continuous process that starts as soon as you launch Sitecore and keeps the reporting database up to date with the most recent interactions for as long as the xDB is running.
The continuous update process works as follows:
- The latest interactions are saved to the collection database.
- Interaction data is added to the processing pool for aggregation.
- An agent worker picks up the interaction from the processing pool and hands it on to an aggregator.
- The aggregator pushes the interaction through the aggregation pipeline and converts the data into a form suitable for the reporting database.
The aggregation process converts data into a form that is easier to query and that is suitable for use with Sitecore reporting applications that use SQL Server.
Once the data is converted into the correct format, it is merged into existing reporting data that is stored in the reporting database, keeping the reporting database continuously in sync with the latest interactions on your website.
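The flow described above (save, queue, pick up, aggregate, merge) can be sketched in a few lines of Python. This is an illustrative model only: the in-memory `collection_db`, `processing_pool`, and `reporting_db` objects and the function names are hypothetical stand-ins, not Sitecore APIs.

```python
from collections import Counter, deque

collection_db = []         # stand-in for the collection database
processing_pool = deque()  # stand-in for the processing pool
reporting_db = Counter()   # stand-in for the reporting database: page -> visits


def save_interaction(interaction):
    """Save the interaction, then add it to the pool for aggregation."""
    collection_db.append(interaction)
    processing_pool.append(interaction)


def aggregator(interaction):
    """Convert one interaction into reporting-friendly rows (page counts)."""
    return Counter(interaction["pages"])


def run_worker():
    """Pick up queued interactions and hand each one to the aggregator."""
    while processing_pool:
        interaction = processing_pool.popleft()
        # Merge the converted rows into the existing reporting data.
        reporting_db.update(aggregator(interaction))


save_interaction({"contact": "a", "pages": ["/home", "/pricing"]})
save_interaction({"contact": "b", "pages": ["/home"]})
run_worker()
```

Because the worker merges into existing totals rather than replacing them, the reporting data stays current without reprocessing older interactions.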
Before Sitecore 8.2 Update 6, indexing processors were part of the interactions pipeline. Starting from Sitecore 8.2 Update 6, indexing processors are part of the indexInteractions pipeline. During aggregation, Sitecore executes the interactions pipeline first, then the indexInteractions pipeline.
Rebuilding the reporting database
Rebuild of the reporting database is the re-processing of interactions that have already been aggregated into the reporting database for use by Sitecore reporting applications. To ensure that the latest changes to the collection database appear in the reporting database, from time to time you need to rebuild the reporting database. When you rebuild the reporting database, its entire contents are overwritten.
Figure: the processing required to keep the reporting database continuously in sync with the collection database.
Figure: the processing required to rebuild the entire reporting database on request.
To minimize interruptions to reporting functionality, the rebuild process works with a dedicated instance of the reporting database, called the reporting secondary database (reporting.secondary).
When the rebuild process has finished, the reporting secondary database replaces the primary reporting database (reporting).
Sitecore xDB does not allow the primary reporting database to be rebuilt in place, which is why the rebuild targets a secondary database. The reporting secondary database is only required during the rebuild process.
You only need to connect a second reporting database if you intend to perform a rebuild of the reporting database.
The rebuild process is semi-automated but requires a system administrator to attach/replace databases in SQL Server and to make modifications to the xDB configuration.
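The following Python sketch models why a secondary database is used: reports keep reading the primary while the secondary is rebuilt from scratch, and the swap happens only at the end. The dictionary of databases and both function names are hypothetical. In a real deployment the swap is the manual step a system administrator performs in SQL Server and in the xDB configuration.

```python
from collections import Counter


def rebuild_into_secondary(collection, databases):
    """Re-aggregate every interaction into a fresh secondary database."""
    secondary = Counter()
    for interaction in collection:
        secondary.update(interaction["pages"])
    databases["reporting.secondary"] = secondary


def promote_secondary(databases):
    """Replace the primary with the rebuilt secondary (manual in practice)."""
    databases["reporting"] = databases.pop("reporting.secondary")


collection = [{"pages": ["/home"]}, {"pages": ["/home", "/contact"]}]
databases = {"reporting": Counter({"/home": 99})}  # out-of-sync primary

rebuild_into_secondary(collection, databases)
# The primary is untouched while the rebuild runs, so reports still work.
served_during_rebuild = databases["reporting"]["/home"]
promote_secondary(databases)
```

After `promote_secondary` runs, the old primary data is gone entirely, which mirrors the fact that a rebuild overwrites the reporting database rather than patching it.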
Reasons for rebuilding the reporting database:
- After using the conversion tool to populate the reporting database (secondary) with analytics information from an earlier version of Sitecore.
- After having amended information in the collection database, in order to reflect the amendments in reports when looking at older data. An example of such an amendment could be assigning channels to referring sites.
- After reclassifying a search keyword or channel, because aggregated report data is not updated automatically.
- If the reporting database has been lost or is irrecoverably out of sync with the collection database, for example, due to a disaster or if the details of two contacts have been merged.
- In Sitecore reporting applications it is possible to reclassify data that has already been processed by the aggregation layer. This could cause the reporting database to become out of sync with the collection database.
Before Sitecore 8.2 Update 6, indexing processors were part of the interactions pipeline, so a rebuild of the Contact Segmentation Index always ran together with a rebuild of the reporting database. Starting from Sitecore 8.2 Update 6, all indexing processors are in the indexInteractions pipeline, so you can rebuild only the reporting database, only the Contact Segmentation Index, or both.
Important: When you rebuild the reporting database, the rebuild process overwrites all existing data in the reporting database.
Time slice aggregation
Sitecore 8.2 Update 6 and later support Time Slice Aggregation. This makes it possible to rebuild the reporting database from only part of the data in the collection database. You can, for example, run the rebuild process for the last three months only. To do this, enter the SaveDateTime value of the oldest interaction you want to include as the Minimum SaveDateTime.
After the rebuild, the rebuild target contains only data from the Minimum SaveDateTime until the present time.
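As a rough illustration of the time slice idea, the Python sketch below selects only the interactions saved at or after a minimum date before they would be re-aggregated. The `select_time_slice` function and the dictionary layout are hypothetical, and treating the boundary as inclusive is an assumption rather than documented Sitecore behavior.

```python
from datetime import datetime


def select_time_slice(interactions, minimum_save_date_time):
    """Keep interactions whose SaveDateTime is at or after the minimum."""
    return [
        item
        for item in interactions
        if item["SaveDateTime"] >= minimum_save_date_time  # inclusive: assumed
    ]


interactions = [
    {"id": 1, "SaveDateTime": datetime(2017, 1, 15)},
    {"id": 2, "SaveDateTime": datetime(2017, 4, 2)},
    {"id": 3, "SaveDateTime": datetime(2017, 6, 20)},
]
minimum = datetime(2017, 4, 1)
to_rebuild = select_time_slice(interactions, minimum)
```

Only the interactions in `to_rebuild` would feed the rebuild, so anything older than the minimum is absent from the rebuild target afterwards.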