Wednesday, July 22, 2020

Approaches to Dockerizing Existing Sitecore Solutions for Local Development


As a developer at a digital agency working in Managed Services, I work with multiple customers spanning multiple versions of Sitecore. The client sites, more often than not, are inherited from outside vendors - each with a unique set of onboarding steps and requirements.

Even with well-defined onboarding steps for local solution setups, depending on the complexity of the solution and local infrastructure required, this process could take new developers days to complete.

In terms of software, we may be required to maintain multiple versions of SQL Server or Solr to accommodate the demands of the varying versions of Sitecore. Common setup tasks include, but are not limited to: downloading and restoring database backup files, running complex Sitecore installations, updating host bindings, changing configurations, installing modules, setting up publishing profiles, and creating configuration transforms. Just when you think you're finally done, you've got conflicting binaries or a misconfiguration causing a YSOD.

I've seen developers' local environments break mid-day, stopping them in their tracks for hours. What about a scenario where a developer is only temporarily joining a team for coverage - or where a developer simply needs a sandbox of a functioning client site to verify or execute a proof-of-concept? The time it takes to onboard their local environment can be daunting.

Of course, creative solutions to this have been developed and tried with varying results (pre-built VM with Visual Studio and a site on a USB drive, anyone?). That is, until Docker.

Many modern Sitecore developers are actively incorporating Docker into their workflow for new development projects using the latest version of Sitecore (9.3 at the time of this post). These startup projects heavily rely on item serialization techniques (TDS, Unicorn, etc.) to deploy onto clean Sitecore images. Existing solutions, unfortunately, may not be able to benefit from this modern developer experience.

That doesn't mean we can't adjust to make Docker a viable option (or at least as a supplement). It's no question now that you will likely be working with Docker in some capacity as the Sitecore development landscape continues to evolve.

If you're like me, you've been "playing around" with Docker for many months now.  After spending time understanding the patterns and mechanics behind the Sitecore Docker Images repository, it becomes increasingly clear how Docker can be used in a real-life scenario.

The challenge, of course, is that the Sitecore Docker Images repository assumes your solution is using a serialization mechanism like TDS or Unicorn that will update the database with all base-level templates, layouts, etc.  While this is most often true when building Sitecore solutions from the ground up, it is less likely true for existing solutions that are already on a maintenance model.

To take an existing solution and "Dockerize" it, we'll want to account for some site-specific customizations.  This includes:
1) Custom SQL Databases (Master, Web, and Core - at minimum)
2) Custom/Prebuilt Solr Cores 
3) Baseline files that make up the website (assuming you have a working local instance already)
4) Custom hostnames that can map to the containers. 

We'll rely heavily on the Sitecore Docker Images repository as a baseline example, following existing patterns while customizing the build.json, Dockerfile, and script files to include new processes that don't already exist. 
It's important to note that the Sitecore Docker Images repository changes frequently. Any examples I use are subject to differences in structure and content, but the concepts should generally remain the same.  

Goals


We'll review common Docker image customizations with repeatable patterns, which I've documented while creating unique Docker images for consumption by developers on a client project.

We'll also identify ways to improve developers' local environment onboarding speed without requiring extensive knowledge of container technology. Installing Sitecore locally would no longer be a requirement.

I'll share some self-descriptive PowerShell scripts that any developer who has the client solution pulled from git (and Docker prerequisites installed) can execute to spin up their environment.  Developers can expect a functional local copy of a unique Sitecore customer's website with full Remote Debugging capabilities (Visual Studio 2017 and 2019, depending on each solution's specific requirements).

I consider this to be a viable supplement (or full replacement) to traditional local environments - allowing you to spin up a local, debuggable copy of a client site in much less time than it would take to install and configure Sitecore locally.

This Example

For this exercise, I'll be using a real client site running on Sitecore 9.0.2.  To keep things simple and lightweight, we'll focus on setting up an XM topology over an XP topology with our baseline website files, custom Master, Web, and Core databases, and custom Solr cores. I'll assume you have a local instance of the site you want to Dockerize.  

Common Tools

Other Prerequisites


Let's start by creating a directory called 'Docker' at the same level as our solution's project directory.  Within this folder, we'll create a 'setup' directory and a 'deploy' directory.  The 'setup' directory will contain a smaller, less beefy version of the Sitecore Docker Images repository containing only what we need for the project.  The 'deploy' directory will act as the target directory for publishing the solution from Visual Studio (when the containers are running, a file watcher transfers the files from the 'deploy' directory to the 'inetpub' directory running in the container).  

The Sitecore Docker Images repository is chock-full of folders and files which allow users to build out clean Sitecore Docker images in various topologies.  Since our customized images won't require a majority of these directories, we can take only what we need to give us a starting point.

In our case, we'll take the following folders and files from the cloned repository and place them into our 'setup' directory:
  • The modules folder (contains the SitecoreImageBuilder PowerShell module)
  • The windows folder 
  • Build.ps1
  • sitecore-packages.json



Inside the 'windows' folder, we'll find a series of folders, two of which stand out among the rest:
dependencies and tests
 
The 'dependencies' folder contains conventional images used across a majority of Sitecore flavors, so we'll just keep that where it is.  

The 'windows/tests' folder contains some crucial files we'll need to compose Docker up and down.  We'll copy the following files into the 'Docker' directory:
  • \windows\tests\9.x.x\.gitignore
  • \windows\tests\9.2.x\Clean-Data.ps1
  • \windows\tests\9.2.x\docker-compose.xm.yml 
    • Rename this to docker-compose.yml
  • \windows\tests\9.2.x\.env

The remaining folders in the windows directory might seem confusing at first, so let's trim it up by removing all folders except the ones that pertain to version 9.0.2:


Let's take a look at the '9.0.2' directory:

The two folders here represent two self-descriptive images; 'sitecore-assets' and 'sitecore-xm' - both of which we will keep as a base.  The 'sitecore-assets' image is used to store entry point PowerShell scripts (all of which are located in the inner tools folder) and referenced in the images' Dockerfile.  You can think of it as a repository of assets that exist solely to store files that will be transferred to other images during the image building process.

The 'sitecore-xm' folder contains a Dockerfile and build.json (among a few other folders/files), which together define the XM CM and CD images.

Within the 9.0.x directory, we'll find a lone 'sitecore-xm-solr' folder.  This folder - and the nested files within it - will be used to build our Solr instance.

Finally, in the 9.x.x directory, we'll find a series of folders that contain various available image topologies.  We can remove all folders except for 'sitecore-xm-sqldev':


To eliminate complexity for our folder structure, we'll copy over each inner folder (those in 9.0.2, 9.0.x, and 9.x.x) directly into the 'windows' folder (IMO: consolidating just makes it easier to navigate). This is optional, and really about preference.

The final folder structure looks like this:
  


As mentioned in the Preface, there are three specific assets we want to share with other Dockerfiles:

Databases

First: site-specific Master, Web, and Core databases.  Most of the websites I manage are hosted in Azure, and as a standard practice, database backups are periodically taken and stored in an Azure Storage Account.  These database backup files use the '.bacpac' format.

If you dig into the existing Sitecore Docker Images Dockerfiles, you'll find that they utilize '.scwdp.zip' packages that contain everything needed to install a specific version of Sitecore (the same files used during the days of the Sitecore Installation Framework) across the several images that make up a Sitecore website.  The '.scwdp.zip' files contain '.dacpac' files that are used to deploy clean Sitecore databases in the 'sitecore-xm-sqldev' image.

By including '.bacpac' files, we'll be overriding the default behavior to upload and import our custom databases instead.  


Solr Cores

Second: custom or prebuilt Solr cores to ensure search capabilities work for whoever happens to grab the images and run the containers - without requiring an index rebuild.

This one may vary depending on your solution, but Solr is most commonly present in some capacity.  Whether you use the default Sitecore indexes for search or have custom Solr indexes, we can inject the pre-built indexes into the 'sitecore-xm-solr' image.  I admit now that this may be a bit overkill since a first-time index rebuild is a super common practice when setting up a local Sitecore environment. However, it is possible, and I've done it, so I'll share it anyway.   

In my case, my local Solr indexes can be packaged from my machine's \server\solr location into a .zip file called 'CustomSolrCores.zip'.


Baseline Website Files

Third: baseline files that make up the website.  I found that publishing the solution to a folder from Visual Studio is a good starting point.  The file repository should exclude the ConnectionStrings.config, as it will be generated in the 'sitecore-xm' image to point to the appropriate Docker Solr and SQL containers.

This folder should also include any specific files that are not in source control - like Sitecore modules you may have installed that require particular files.  For example, if you have Sitecore PowerShell Extensions installed, you'll want to include the data from the 'sitecore modules' folder (the Core database we're uploading will have everything else we need for the module to function).  


Assets Image Customization

The 'sitecore-assets' folder we obtained in Part I can be used for this. In this folder, we'll create a folder called 'custom':


All custom assets can be copied into the 'custom' folder:

  • XXXXX-master-db.bacpac
  • XXXXX-web-db.bacpac
  • XXXXX-core-db.bacpac
  • CustomSolrCores.zip
  • CustomBaselineFiles.zip

Note: I don't recommend zipping the .bacpac files up due to Windows extraction size limitations I've experienced first-hand.

Customize the build.json 

The build.json file customizations are pretty limited in the 'sitecore-assets' folder.  We simply want to ensure the tag and source parameters are corrected for the Sitecore version we're setting this up for. 


Customize the Dockerfile 

We want to load the assets in the 'custom' folder into the 'sitecore-assets' image.  This can be done by adding a new COPY instruction in the existing '#copy local assets' section:
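As a sketch, the instruction might look like this (the destination path and the assumption that the 'custom' folder sits beside the Dockerfile are based on the repository's conventions - adjust to match your layout):

```dockerfile
# escape=`
# Copy our site-specific assets from the build context into the image.
COPY ["custom/", "C:\\custom\\"]
```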



This command copies the folder from the build context into the image's C:\custom location during the build.
 
Near the end of the Dockerfile, we'll include another COPY instruction:
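A sketch of that second instruction, assuming the multi-stage layout the repository uses (the 'build' stage name is illustrative):

```dockerfile
# escape=`
# Carry the custom assets from the build stage into the final image.
COPY --from=build ["C:\\custom\\", "C:\\custom\\"]
```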


The 'sitecore-assets' image now contains our custom .bacpac database files, which can be referenced and utilized in all other Dockerfiles - including 'sitecore-xm-sqldev.'

Customize the build.json

The build.json configuration file for the 'sitecore-xm-sqldev' image contains several version-specific tags by default. 

 

To simplify this for our solution, we can remove all but the 9.0.2 tag.  We'll also include three additional build arguments which serve as path references to each custom database file within the C:\custom folder in the 'sitecore-assets' image. 
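The trimmed-down file might look something like this (the tag string and build argument names are hypothetical - mirror the schema of the repository's existing build.json files):

```json
{
  "tags": [
    {
      "tag": "sitecore-xm-sqldev:9.0.2",
      "build-options": [
        "--build-arg ASSETS_CUSTOM_MASTER_DB=custom/XXXXX-master-db.bacpac",
        "--build-arg ASSETS_CUSTOM_WEB_DB=custom/XXXXX-web-db.bacpac",
        "--build-arg ASSETS_CUSTOM_CORE_DB=custom/XXXXX-core-db.bacpac"
      ]
    }
  ]
}
```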

 


Customize the Dockerfile

In the default Dockerfile, we can see how the .scwdp.zip file (containing all Sitecore installation files for 9.0.2) is consumed from the 'sitecore-assets' image.

The ASSETS_IMAGE argument in the build.json file is referenced at the top using an ARG instruction. A FROM instruction specifies the 'sitecore-assets' image as a 'Parent Image.' An INSTALL_PATH environment variable is set to a C:/install folder in the 'sitecore-xm-sqldev' image. Finally, a COPY instruction copies the file from the 'sitecore-assets' to the defined INSTALL_PATH location on the 'sitecore-xm-sqldev' image. 

 

This file is then further processed (extraction by Extract-Database.ps1 and installation by Install-Database.ps1) using PowerShell commands via the RUN instruction.

We can follow these same patterns to copy our three custom database .bacpac files from the 'sitecore-assets' image to the 'sitecore-xm-sqldev' image:
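Mirroring that pattern, the additions might look like this (argument names are hypothetical and must match the build.json; the install path mirrors the existing INSTALL_PATH convention):

```dockerfile
# escape=`
ARG ASSETS_CUSTOM_MASTER_DB
ARG ASSETS_CUSTOM_WEB_DB
ARG ASSETS_CUSTOM_CORE_DB

# Copy each custom .bacpac from the assets image into the install location.
COPY --from=assets ["${ASSETS_CUSTOM_MASTER_DB}", "C:\\install\\"]
COPY --from=assets ["${ASSETS_CUSTOM_WEB_DB}", "C:\\install\\"]
COPY --from=assets ["${ASSETS_CUSTOM_CORE_DB}", "C:\\install\\"]
```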


Customize Database Installation Script

The extraction and installation PowerShell scripts accept the INSTALL_PATH variable where the .scwdp.zip Sitecore installation file and our three custom .bacpac files now reside. Since we don't need to extract anything new, we can skip any customizations to Extract-Databases.ps1.

The default Install-Databases.ps1 script is responsible for executing a SqlPackage.exe command for each available .dacpac file (extracted from the .scwdp.zip file) in the $InstallPath location:


Our updates should execute before the existing Get-ChildItem block.

First, after the Push-Location -Path $InstallPath command, we'll add a statement that enables contained database authentication on SQL Server (a common step when installing Sitecore on a local environment):
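A minimal sketch of that statement, assuming the sqlcmd utility available in the SQL image:

```powershell
# Enable contained database authentication so the imported Sitecore
# databases (which use contained users) will function.
sqlcmd -Q "sp_configure 'contained database authentication', 1; RECONFIGURE;"
```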


Add a new Get-ChildItem block that filters and iterates through each .bacpac file in the $InstallPath location.  We'll use explicitly defined match conditions to map the Master, Web, and Core database backup file names to the respective database names they will be installed as (e.g., 'website-master-db.bacpac' will be installed with the name Sitecore.Master).


To 'import' a .bacpac file, our SqlPackage.exe command will use an /a:Import argument instead of the /a:Publish argument used for .dacpac files.


As a standard practice, we'll reset the admin user credentials to admin/b.


When we put that all together, the new script block looks like this:
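A sketch of that combined block (file-name patterns, the server name, and the SqlPackage.exe invocation are illustrative - match them to your backup naming convention and the existing script):

```powershell
# Import each custom .bacpac, mapping its file name to a target database name.
Get-ChildItem -Path $InstallPath -Filter "*.bacpac" | ForEach-Object {
    $targetDatabase = switch -Wildcard ($_.Name) {
        "*-master-db.bacpac" { "Sitecore.Master" }
        "*-web-db.bacpac"    { "Sitecore.Web" }
        "*-core-db.bacpac"   { "Sitecore.Core" }
    }

    Write-Host "Importing $($_.Name) as $targetDatabase..."

    # /a:Import handles .bacpac files; the default loop uses /a:Publish for .dacpac.
    & "SqlPackage.exe" /a:Import /sf:"$($_.FullName)" /tdn:"$targetDatabase" /tsn:"localhost" /q
}
```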


Since these updates will result in the Sitecore.Master, Sitecore.Web, and Sitecore.Core database names being in use in SQL Server, we will pipe a Where-Object cmdlet into the existing Get-ChildItem cmdlet that processes the .dacpac files to exclude the default Sitecore databases - effectively skipping the default installation behavior:
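A sketch of that filter (the .dacpac base names are assumed to match the database names used above):

```powershell
# Skip the default .dacpac deployments for databases we've already imported.
Get-ChildItem -Path $InstallPath -Filter "*.dacpac" |
    Where-Object { $_.BaseName -notin @("Sitecore.Master", "Sitecore.Web", "Sitecore.Core") } |
    ForEach-Object {
        # ...the existing SqlPackage.exe /a:Publish logic remains unchanged...
    }
```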

  
This is optional.  The original thought was to have components that rely on Solr search indexes, custom or otherwise, fully functional without an initial index rebuild. If you prefer this to be a manual follow-up step, that's totally cool, too.   

If you proceed with this approach, we can take steps similar to those used during the SQL image customization: setting up the build.json configuration and customizing the Dockerfile to copy, extract, and include our custom Solr core data from the CustomSolrCores.zip in the assets image.

Customize the build.json 

The build.json configuration file for the 'sitecore-xm-solr' image contains only one version-specific tag definition by default.


Since we plan to utilize the 'sitecore-assets' image, we'll need to include a build argument for the ASSETS_IMAGE tag name, and an ASSETS_CUSTOM_SOLR_CORES argument pointing to the CustomSolrCores.zip file location in the 'sitecore-assets' container.
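The resulting entry might look like this (the tag string and asset image name are illustrative):

```json
{
  "tags": [
    {
      "tag": "sitecore-xm-solr:6.6.2-9.0.2",
      "build-options": [
        "--build-arg ASSETS_IMAGE=sitecore-assets:9.0.2",
        "--build-arg ASSETS_CUSTOM_SOLR_CORES=custom/CustomSolrCores.zip"
      ]
    }
  ]
}
```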


I've changed the SOLR_VERSION parameter to 6.6.2 (the version used by this particular site) and, to ensure there aren't any discrepancies in the schema which could cause issues rebuilding the index, I've copied the contents of the managed-schema file from my local Solr 6.6.2 installation (C:\solr\solr-6.6.2\server\solr\configsets\basic_configs\conf) to the \Docker\setup\windows\sitecore-xm-solr\managed-schema.default file.

Customize the Dockerfile 

This image references a 'Build Image' - meaning a container will be temporarily spun up during the build process to store folders and files that will simply be copied to the final Solr image. We'll include a FROM instruction and a new ARG instruction to reference the 'sitecore-assets' image - which contains our CustomSolrCores.zip file.  Then we can use the COPY instruction to pull the CustomSolrCores.zip file into the C:\custom folder of the running container - followed by a RUN instruction using the Expand-Archive PowerShell command to extract the contents to the C:\customcores folder in the running container.

A Copy-Item cmdlet can be added after the standard Solr installation steps that are already in place to layer our custom Solr core data over the fresh installation.  Finally, we execute a COPY instruction to transfer the folders from the Build Image to the Solr image.
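Put together, the customization might sketch out like this (stage names, base images, and paths are assumptions - fold these lines into the existing Dockerfile rather than replacing it):

```dockerfile
# escape=`
ARG ASSETS_IMAGE
ARG BUILD_IMAGE
FROM ${ASSETS_IMAGE} as assets
FROM ${BUILD_IMAGE} as build

ARG ASSETS_CUSTOM_SOLR_CORES
SHELL ["powershell", "-Command", "$ErrorActionPreference = 'Stop';"]

# Pull the zipped cores out of the assets image and extract them.
COPY --from=assets ["${ASSETS_CUSTOM_SOLR_CORES}", "C:\\custom\\"]
RUN Expand-Archive -Path 'C:\custom\CustomSolrCores.zip' -DestinationPath 'C:\customcores'

# ...the standard Solr installation steps happen here...

# Layer the custom core data over the fresh installation.
RUN Copy-Item -Path 'C:\customcores\*' -Destination 'C:\solr\server\solr' -Recurse -Force
```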





Customize the build.json

The build.json configuration file for the 'sitecore-xm' image contains two version-specific tags by default - one for Content Management and another for Content Delivery.


We'll include one additional ASSETS_BASELINEFILES_WDP build argument, which serves as a path reference to the CustomBaselineFiles.zip file within the C:\custom folder in the 'sitecore-assets' image.  We'll also remove the \\XM\\ portion of the ASSETS_USE_WDP build argument so the standard installation files resolve correctly.
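The added build option might look like this (the argument name is hypothetical and must match what the Dockerfile expects):

```json
"build-options": [
  "--build-arg ASSETS_BASELINEFILES_WDP=custom/CustomBaselineFiles.zip"
]
```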



Customize the Dockerfile

The 'sitecore-xm' Dockerfile has quite a bit to it, as its primary objective is to extract the Sitecore files and set up IIS on the container.

 
The same patterns used for customizing the SQL and Solr images still apply for adding in our custom file from the BaselineFiles.zip.

We'll add a new argument after ARG SC_ROLE_CONFIG_DIRECTORY:



Add a new COPY instruction after COPY --from=assets ["${ASSETS_USE_WDP}", "C:\\temp\\packages\\"]



Add a new RUN instruction after RUN Expand-Archive -Path 'C:\\temp\\packages\\*.zip' -DestinationPath 'C:\\temp'; `
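In context, the three additions might sketch out like this (the argument name and destination path are assumptions - the destination should match wherever the image extracts the site root):

```dockerfile
# escape=`
ARG ASSETS_BASELINEFILES_WDP

# Copy the baseline zip from the assets image...
COPY --from=assets ["${ASSETS_BASELINEFILES_WDP}", "C:\\temp\\baseline\\"]

# ...and extract it over the website root.
RUN Expand-Archive -Path 'C:\temp\baseline\CustomBaselineFiles.zip' -DestinationPath 'C:\inetpub\sc' -Force; `
    Remove-Item -Path 'C:\temp\baseline' -Recurse -Force
```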



With this project, there are extensive IIS Rewrite rules required for the site to operate correctly. By default, the rules didn't seem to function. To mitigate this (and this, of course, is optional), at the end of the Dockerfile I've added a new process that downloads and installs the IIS URL Rewrite Module (via PowerShell) onto the container, then executes a subsequent iisreset command to ensure the module is properly installed and enabled during the "boot up" process.
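A hedged sketch of that process (the download URI is deliberately parameterized - supply the official Microsoft link for the URL Rewrite module installer):

```dockerfile
# escape=`
ARG REWRITE_MSI_URL

# Download and silently install the IIS URL Rewrite module, then restart IIS.
RUN Invoke-WebRequest -Uri $env:REWRITE_MSI_URL -OutFile 'C:\rewrite.msi' -UseBasicParsing; `
    Start-Process 'msiexec.exe' -ArgumentList '/i', 'C:\rewrite.msi', '/quiet', '/norestart' -Wait; `
    Remove-Item 'C:\rewrite.msi'; `
    iisreset
```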


The final version of the Dockerfile looks like this:

The following steps might be reserved for an architect responsible for managing the Docker images.

Building the images for this solution is really no different than the build process used with the Sitecore Docker Images repository.  To build your images locally, you may run something like this:
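For example (parameter names follow the repository's Build.ps1 at the time of writing - check your copy for the exact signature, and the install source path is illustrative):

```powershell
# Build the trimmed-down image set locally, skipping images already built.
.\Build.ps1 -InstallSourcePath "C:\sitecore-install" -SkipExistingImage
```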


If you've set up a private container repository in Azure:


Tip: When building your images, especially if you're making customizations, you'll likely run into errors - which you'll resolve image by image while re-running the build command.  Use the -SkipExistingImage parameter to skip existing image builds and save time.

Once we've built our images, we need to configure our docker-compose.yml and .env files, which will be used to compose our containers based on our images.

.env

The .env file we copied in the Docker Build Folder Setup section contains environment-specific parameters that will be used by the docker-compose.yml file.  The content in the file we copied from the Sitecore Docker Images repository is minimal:
 

There aren't many limitations to what we can include in our .env file. One of the first things we'll change is the SITECORE_VERSION entry to match our 9.0.2 version. We can also remove the LICENSE_PATH as we will replace it with a custom parameter that defines the full file path.

Here are some non-standard parameters I've included - which we'll use in the docker-compose.yml file and our custom start-up scripts:

  • SITECORE_LICENSE=C:\license\license.xml
    This will be a direct path to a valid Sitecore license on our local machine.
  • SQL_SA_PASSWORD=P@ssw0rd
    We'll replace the default SQL password with whatever we want here. I recommend doing a full find-and-replace in the setup folder to ensure the SQL login passwords match.
  • HOST_NAME=clienthostname.local
    This will hold a custom hostname, which we'll consume in the docker-compose.yml file.
  • HOST_NAME_2=clienthostname2.local
    This will hold a secondary custom hostname, which we'll consume in the docker-compose.yml file.
  • VS_VERSION=2017
    Depending on the project's Visual Studio version (2017 without native Docker debugging support versus 2019 with native Docker container debugging support), we will want to serve different "how to debug" messages in our custom start-up scripts.
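Put together, a complete .env for this setup might look like this (the values are the examples above - adjust per project):

```ini
SITECORE_VERSION=9.0.2
SITECORE_LICENSE=C:\license\license.xml
SQL_SA_PASSWORD=P@ssw0rd
HOST_NAME=clienthostname.local
HOST_NAME_2=clienthostname2.local
VS_VERSION=2017
```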


docker-compose.yml

The compose file should, at a minimum, define the applications (SQL, Solr, CM, and optionally CD) to be composed up. There are a few customizations I've found useful that may not be present in the sample docker-compose.yml files in the Sitecore Docker Images repository.

Volumes

Adding volume definitions for the SQL and Solr instances allows for some level of data-state persistence between composing up and down. To do that, something like this can be added to the bottom of the file:


Then, we can define the volume on the SQL and Solr nodes:
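A sketch of both pieces together (volume names and the container data paths are illustrative):

```yaml
services:
  sql:
    volumes:
      - sql-data:C:\Data
  solr:
    volumes:
      - solr-data:C:\Data

volumes:
  sql-data: {}
  solr-data: {}
```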


Finally, we'll make a few adjustments to our CM's volume definition to include references to our relative deploy folder path, license file path, and remote debugger (which varies based on Visual Studio version):
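For example (host paths are illustrative; Windows containers require folder mounts, so we mount the folder containing license.xml and the Remote Debugger folder for the project's Visual Studio version):

```yaml
  cm:
    volumes:
      - .\deploy:C:\src                         # watched by the deploy file watcher
      - C:\license:C:\license                   # folder containing license.xml
      - C:\remote_debugger:C:\remote_debugger   # msvsmon folder for VS 2017/2019
```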


Windows Hosts Writer

Rob Ahnemann's Windows Hosts Writer image is quite handy for applying custom domains for the running containers as it monitors the containers on the Docker network and updates your local C:\Windows\System32\drivers\etc\hosts file. We can add the following node after the CM or CD definition:
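Something along these lines (the image tag is illustrative and should match your host OS version - consult the project's README for the exact mounts):

```yaml
  hosts-writer:
    image: rahnemann/windows-hosts-writer:1.0-1809
    volumes:
      - source: \\.\pipe\docker_engine
        target: \\.\pipe\docker_engine
        type: npipe
      - C:\Windows\System32\drivers\etc:C:\driversetc
```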


We can also reference the custom aliases we defined in the .env file as a network node on the CM (and/or CD) definition(s):
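For instance, using the HOST_NAME values from the .env file:

```yaml
  cm:
    networks:
      default:
        aliases:
          - ${HOST_NAME}
          - ${HOST_NAME_2}
```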

Full docker-compose.yml


We can treat our Docker environment like any other local instance. Many of my projects utilize XML transformations for transforming configuration files, so we can start by creating a DOCKER build configuration - which can be used by any developer utilizing the Docker environment approach:

We should also create configuration transform files based on our DOCKER build configuration - such as a ConnectionStrings.config transform - to match our environment's settings in the build (ensuring we don't overwrite contents our containers rely on by default).
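A sketch of such a transform, assuming the compose service names 'sql' and 'solr' and the SQL password from the .env file (connection string names vary per solution):

```xml
<connectionStrings xmlns:xdt="http://schemas.microsoft.com/XML-Document-Transform">
  <add name="core"
       connectionString="Data Source=sql;Initial Catalog=Sitecore.Core;User ID=sa;Password=P@ssw0rd"
       xdt:Transform="SetAttributes" xdt:Locator="Match(name)" />
  <add name="solr.search"
       connectionString="http://solr:8983/solr"
       xdt:Transform="SetAttributes" xdt:Locator="Match(name)" />
</connectionStrings>
```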

Once our Build Configuration and corresponding transforms are set, we can create a Publish Profile that points to the relative /deploy folder located in our Docker folder:

Enable Containers View in Visual Studio 2019

Visual Studio 2019 has really promising container support.  To enable these features, go to
View > Other Windows > Containers
The Containers window provides insight into each container including a file browser and log stream for each:

Debugging in Visual Studio 2019

  1. Launch the `Attach to Process` window 


  2. Select `Docker` for `Connection type` and click `Find...`. Select the `CM` or `CD` instances and click `OK.` 


  3. Ensure the `Attach to` setting is set to `Managed (v4.6, v4.5, v4.0) code` and select `w3wp.exe` in the `Available processes` list (`Show processes from all users` must be checked for the IIS process to display).

Debugging in Visual Studio 2017

This version of Visual Studio unfortunately doesn't have native container support, but debugging is still possible. The custom startup scripts we'll develop should help by providing guidance to the developer. Generally, though, the process looks like this once you've opened the Attach to Process window: 
  1. Connection type: Remote (no authentication)
  2. Connection target: XXX.XXX.XXX.XXX:4022
  3. Attach to: Managed (v4.6, v4.5, v4.0) code, Native code
  4. Click Refresh
  5. Available processes: w3wp.exe
  6. Click Attach


Developer Startup Scripts

One of our goals is to provide ease-of-use for our developers, who may not have in-depth Docker skillsets. In our 'Docker' folder, we can write some scripts to make the process of composing up and down straightforward.  

Also, we want to give the user a full picture of the SQL Server, Solr/site URLs, and debugging instruction/endpoints, so there aren't any gaps in utilizing the various containers once they are up and running.   

_Start-Environment.ps1

This script is what we run to start the environment.  There are four main sections to it.
  1. Back up the C:\Windows\System32\drivers\etc\hosts file. Since we rely on Rob's hosts writer to dynamically modify our local machine's hosts file, there is a risk that it could unintentionally overwrite the file's contents (which has happened to me at least once).

  2. Execute the docker-compose up command in detached mode:

  3. Compile and print the website URLs, the Visual Studio version with debugging instructions, the Solr URL, and the SQL Server connection information and credentials:

  4. Verify the environment's status:
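A condensed sketch of those four steps (URLs, credentials, and paths are illustrative - the full script adds formatting and error handling):

```powershell
# 1. Back up the hosts file before the hosts writer modifies it.
Copy-Item -Path "C:\Windows\System32\drivers\etc\hosts" `
          -Destination "C:\Windows\System32\drivers\etc\hosts.bak" -Force

# 2. Compose the environment up in detached mode.
docker-compose up -d

# 3. Print the URLs and debugging guidance for the developer.
Write-Host "CM:   http://clienthostname.local"
Write-Host "Solr: http://localhost:8983/solr"
Write-Host "SQL:  localhost,14330 (sa / P@ssw0rd)"

# 4. Verify the environment's status.
docker-compose ps
```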


The full script:
When the script is executed from an admin PowerShell console, developers see the following output:

_Stop-Environment.ps1

This script is what we run to stop the environment.  There are four main sections to it.


_Pull-DockerImages.ps1

This script is intended to update the images for the Docker environment when they have been updated in the remote container registry.


Bonus: DockerEnvironmentManager.ps1

This script takes the above three scripts and wraps them in a pure PowerShell GUI 😋:


 

I started on this journey curious and skeptical as to whether working with a Docker environment in place of a traditional local Sitecore environment was a viable option in my line of work.  Countless hours of trial and error development have made me a convert.  I've applied variants of the approaches outlined and have successfully Dockerized four client environments (8.2.1, 9.0.0, 9.0.1, and 9.2.0), allowing me to retire my original local setups. Two of these are in use now and accessible to developers.

Key Takeaways

  • Docker is happening. If you're not already using it, you will likely start in the coming years as Docker transitions into a formally supported hosting option for Sitecore.
  • Even if you're working with a solution that has no Docker history, you can customize everything and get it working. Keep tinkering. It's worth it.
  • Running multiple containers at once is resource-intensive. You'll need to decide which Sitecore topology to select: the Experience Management (XM) topology is more lightweight, while XP is a bit heavier. Keep it simple wherever you can, then expand and provide options for developers to use either one.
  • Once you've got a Docker environment, you may need to periodically update the images. These roles and responsibilities are best clearly documented.

If you have any questions, feel free to drop a line in the comments or reach out directly on Sitecore Slack.
