On Database Refreshes

Cover Image for On Database Refreshes

Why Refresh?

Developing processes for database refreshes has many benefits including:

  • Consistency across environments (which means PROD bugs are more likely to be replicable on lower environments)
  • Reveals data which should be moved to environment configs
  • Reveals environment specific data and URLs within your items
  • Time savings (if automation is used)
  • Can help identify areas for performance improvements
  • Can help identify problematic customizations
  • Helps get local development environments set up

The Caveman Refresh

Perhaps the quickest and easiest method of refreshing a database is to generate a content package from PROD and install it on the downstream site(s). In many cases, this is sufficient, and it's a good starting point that will help you identify problem areas and automation opportunities.

Dynamic item add filters

However, there are some areas to watch out for with the package approach:

  • Item buckets can be a problem; especially if you have custom tree structure rules. There could be a big performance hit while installing the package. You also may need to resync buckets after installing the package.
  • See if you have any custom save / update / event handlers. These can cause performance issues and lead to unexpected issues due to bulk item adding / editing / moving / etc.

Now let's discuss a more comprehensive approach to database refreshes.

Preparing for the New Process

  • Define a schedule for when the refreshes should occur
  • Identify who the users are and consider that user passwords should not be the same across environments (unless you are using SSO)
  • Identify which items may need to be updated after the refresh. Scan the database for environment-specific URLs, values, and API keys. SXA site settings items often contain environment specific values. Content such as blogs are also frequent culprits for containing environment specific URLs due to content authors not using dynamic sitecore links. You do not want environment specific URLs in most places, because it can cause confusion if users think they are viewing a staging site but click a link and are actually then viewing the live site. See the script below for an example of scanning items for a specific value. This script can be run in the Sitecore PowerShell Extensions (SPE) console. If you have a large database, you may want to modify this script to search against indexes rather than the database directly.

_38
function SearchItemsForValue {
_38
param (
_38
[string]$databaseName,
_38
[string]$searchValue
_38
)
_38
_38
$rootItem = Get-Item -Path "${databaseName}:\sitecore"
_38
_38
Write-Host "Items in $($databaseName) database with fields containing '$searchValue'"
_38
Write-Host "Starting at root item $($rootItem.Paths.FullPath)"
_38
Write-Host ""
_38
_38
# Recursively scan items and their fields
_38
$itemsWithSpecifiedValue = @()
_38
_38
$rootItem.Axes.GetDescendants() | ForEach-Object {
_38
$item = $_
_38
_38
# Check each field in the item
_38
$item.Fields.ReadAll()
_38
foreach ($field in $item.Fields) {
_38
# Skip standard values and __ fields
_38
if ($field.Value -ne $null -and $field.Name -notlike "__*") {
_38
if ($field.Value.Contains($searchValue)) {
_38
$itemsWithSpecifiedValue += $item
_38
Write-Host "Item: $($item.Paths.FullPath) | Field: $($field.Name) | Value: $($field.Value)"
_38
Write-Host ""
_38
}
_38
}
_38
}
_38
}
_38
_38
return $itemsWithSpecifiedValue
_38
}
_38
_38
$searchValue = "YOUR PROD ENVIRONMENT URL GOES HERE"
_38
$masterItems = SearchItemsForValue -databaseName "master" -searchValue $searchValue
_38
#$coreItems = SearchItemsForValue -databaseName "core" -searchValue $searchValue

Once you've done your initial preparation, consider these pre and post-refresh steps.

Pre-refresh Steps

General Steps

  • Ensure there aren't pending changes or test items on the downstream environment. It's always a good idea to fully back up the database prior to refreshing.
  • Schedule a blackout period in which content authors should not attempt to access the site.

Database Backup and Restoration

Understand that exporting databases can result in a DTU hit, which can have performance implications and incur additional charges. You can temporarily scale up your databases while performing such operations.

Ideally, you want to base your refreshes on a point in time backup, else data may still be changing while you are exporting.

Post-refresh steps

Connection Strings and Database Credentials

Adjust your connection string names as necessary, depending on the names of your newly created backup DBs. You will also need to ensure that the databases are accessible with the user credentials you specify (ex. masteruser, coreuser, webuser). The easiest way to do this on downstream environments is to use your sa (system admin) credentials when connecting to your master, core, and web databases. This is, less secure than having individual users for each database, but it's a fast and effective way of avoiding authentication / permissions issues.

App Pool Recycle

Make sure to restart all of your sites after you restore the DB and connect to it. If you don't, the refresh may appear to have been unsuccessful (data may appear unchanged).

Database Cleanup

After restoring the databases, and prior to reindexing, you will want to clear the event and publish queues as well as property data which doesn't need to be retained during the pulldown. Not doing this cleanup can result in performance issues. The queries below can either be run in the Azure portal query editor or by connecting to the DBs via SSMS:


_11
# Run this on core, master, web dbs
_11
TRUNCATE TABLE [dbo].[EventQueue]
_11
TRUNCATE TABLE [dbo].[PublishQueue]
_11
TRUNCATE TABLE [dbo].[Properties]
_11
_11
# Run on web db (IF you have enabled this)
_11
# https://doc.sitecore.com/xp/en/developers/90/platform-administration-and-architecture/the-history-table-and-the-history-engine.html
_11
# The history engine saves information about changes made to items in the History table. All Sitecore content databases contain this table. The engine logs the item change data based on Sitecore events. It subscribes to events when it is initialized it and registers its own handlers that track item history.
_11
# Also note that you may already have a cleanup task enabled, so check your ShowConfig for:
_11
# <agent type="Sitecore.Tasks.CleanupHistory" method="Run" interval="04:00:00" patch:source="Sitecore.Services.GraphQL.Serialization.config"/>
_11
TRUNCATE TABLE [dbo].[History]

If database access is an issue, the publish and event queues can also be cleared on these admin pages:

  • /sitecore/admin/PublishQueueStats.aspx
  • /sitecore/admin/EventQueueStats.aspx

Run the Content Update Script

If you identified items with environment-specific URLs or values, you'll need to update those items after the database refresh. You can do this manually, but it's best to automate this process. The following script is an example of how you can do this using Sitecore PowerShell Extensions (SPE). This script will prompt you to select an environment, and then it will update the specified items with the appropriate values. You can modify this script to suit your needs.


_99
$options = @{
_99
"DEV"="DEV"
_99
"STAGE"="STAGE"
_99
}
_99
_99
$props = @{
_99
Parameters = @(
_99
@{Name="selectedOption"; Title="Choose an option"; Options=$options}
_99
)
_99
Title = "Environment selector"
_99
Description = "Restore values for which environment?"
_99
Width = 300
_99
Height = 300
_99
ShowHints = $false
_99
}
_99
_99
$result = Read-Variable @props
_99
_99
if($result -ne "ok")
_99
{
_99
Exit
_99
}
_99
_99
Write-Host "Selected environment: $($selectedOption)"
_99
Write-Host ""
_99
_99
$isDev = $selectedOption -eq "DEV"
_99
_99
$Restorable = @(
_99
@{
_99
"fPath"="master://sitecore/content/ACME/MySite/Settings";
_99
"field"="ServerSideRenderingEngineEndpointUrl";
_99
"updatedVal"=
_99
if ($isDev)
_99
{"https://dev.acme.com/api/editing/render"}
_99
else
_99
{"https://stage.acme.com/api/editing/render"};
_99
},
_99
@{
_99
"fPath"="master://sitecore/content/ACME/MySite/Settings";
_99
"field"="ServerSideRenderingEngineApplicationUrl";
_99
"updatedVal"=
_99
if ($isDev)
_99
{"https://dev.acme.com/"}
_99
else
_99
{"https://stage.acme.com/"};
_99
},
_99
_99
@{
_99
"fPath"="master://sitecore/content/ACME/MySite/Settings/Site Grouping/MySite";
_99
"field"="TargetHostName";
_99
"updatedVal"=
_99
if ($isDev)
_99
{"dev.acme.com"}
_99
else
_99
{"stage.acme.com"};
_99
},
_99
@{
_99
"fPath"="master://sitecore/content/ACME/MySite/Settings/Site Grouping/MySite";
_99
"field"="HostName";
_99
"updatedVal"=
_99
if ($isDev)
_99
{"dev-content.acme.com"}
_99
else
_99
{"stage-content.acme.com"};
_99
})
_99
_99
foreach ($restore in $Restorable){
_99
$item = Get-Item -Path $restore.fPath -language en
_99
_99
Write-Host "Updating item: $($item.ID) $($restore.fPath)"
_99
Write-Host "Updating field: $($restore.field)"
_99
Write-Host "--"
_99
Write-Host "From:"
_99
Write-Host "$($item[$($restore.field)])"
_99
Write-Host ""
_99
_99
$item.Editing.BeginEdit()
_99
$item["$($restore.field)"] = "$($restore.updatedVal)"
_99
$item.Editing.EndEdit() | Out-Null
_99
_99
Write-Host "To:"
_99
Write-Host "$($item[$($restore.field)])"
_99
_99
Write-Host ""
_99
Write-Host "Publishing item..."
_99
_99
$languages = [Sitecore.Data.Managers.LanguageManager]::GetLanguages($item.Database) | ForEach-Object { $_.Name }
_99
_99
$publishParameters = @{
_99
Item = $item
_99
Target = "web"
_99
Language = $languages
_99
}
_99
_99
Publish-Item @publishParameters
_99
_99
Write-Host "----------------------------------------"
_99
}

Reindexing

You need to reindex after restoring a database, else the indexes will be out of sync with the restored database. Downstream environments may be underpowered compared to PROD, so if reindexing takes too long or consistency fails, you may need to scale up the environment resources (especially when using App Services).

Now you just need to refresh. A good way to speed up this process with a more surgical approach is this solution by Ben Golden (https://bgolden.digital/2017/01/21/re-index-tree-for-selected-index-with-spe/):

This script lets you select one index to refresh for the selected subtree. Note that it just starts the job and closes the window. It does not monitor the job's progress. You could add this if you desire or simply use the jobs admin page. The script needs to be saved under /sitecore/system/Modules/PowerShell/Script Library/Content Editor/Context Menu after you have enabled the context menu integration point in the module.


_22
$item = Get-Item .
_22
Write-Host "Current item: $($item.Paths.FullPath)"
_22
$options = @{}
_22
[Sitecore.ContentSearch.ContentSearchManager]::Indexes | Foreach-Object { $options.Add($_.Name, $_.Name) }
_22
$props = @{
_22
Parameters = @(
_22
@{Name="indexName"; Title="Choose an index"; Options=$options; Tooltip="Choose one."}
_22
)
_22
Title = "Index selector"
_22
Description = "Choose an index."
_22
Width = 300
_22
Height = 300
_22
ShowHints = $true
_22
}
_22
$result = Read-Variable @props
_22
if ($result -eq "ok") {
_22
$index = [Sitecore.ContentSearch.ContentSearchManager]::GetIndex($indexName)
_22
[Sitecore.ContentSearch.Maintenance.IndexCustodian]::Refresh($index, [Sitecore.ContentSearch.SitecoreIndexableItem]$item)
_22
} else {
_22
Write-Host "User cancelled. Exiting."
_22
}
_22
Close-Window

Restore Users

If you are not using SSO for authentication, remember that restoring the core database will also restore user credentials from PROD. You'll need to update these credentials to match the credentials used in the environment you are restoring to, and your passwords should be different across environments.

Cleanup

Remember to delete any database exports, backups, or databases that are no longer needed.

KEEP BUILDING,

MG


More Stories