Showing posts with label azure. Show all posts

Monday, May 20, 2024

Sitecore PowerShell Extensions Text-to-Speech Audio Synthesis Module

Another year, another exciting Sitecore Hackathon!  This round, I flew solo under the moniker "Sitecorepunk 2077" (a play on the critically acclaimed 2020 action role-playing video game "Cyberpunk 2077").


If you're curious how the event unfolded, I documented my progress on X (formerly Twitter) every couple of hours:













Needless to say, I was utterly exhausted and slept for 12 hours straight, following the 32 hours I had been awake.  While I didn't snag a win (congrats, team Cloud Surfers and team 451 Unavailable For Legal Reasons), I enjoyed the experience, am proud of what I was able to produce, and look forward to the next one.


Module Concept and Inspiration

The 2024 Sitecore Hackathon category I chose to work against was "Best Module for XM/XP or XM Cloud" - although the result could also fit the bill for "Best use of AI".  Inspired by the ever-increasing need for accessible content, I decided to develop a module that converts text content into spoken audio files, which are then stored remotely and saved as MP3 links within the item's context - all from within Sitecore. Once I landed on the idea, the goal was to provide an easy-to-use tool for generating audio versions of Sitecore content, thereby enhancing accessibility and improving user engagement for individuals with visual impairments or preferences for audio content.

Features

Here’s a breakdown of what makes the SPE Text-to-Speech Audio Synthesis Module stand out:

Lifelike Speech Synthesis from Microsoft Azure Cognitive AI Speech Services

One of the core features of this module is its ability to convert text content into lifelike speech. This makes content more accessible to a broader audience, including those with visual impairments and individuals who prefer consuming content through audio.

The module utilizes Microsoft Azure Cognitive Services Speech Service to generate audio from selected text fields dynamically. This integration ensures high-quality, natural-sounding speech output. Whether it's a blog post, news article, or product description, every piece of content can be converted into audio, broadening its reach and enhancing user engagement.

Storage via Azure Blob Storage

To store the generated audio files, the module leverages Azure Blob Storage APIs. Once an audio file is generated and stored locally in a temporary directory, it is uploaded to a dedicated Azure Storage container. The API returns a URL to the audio file, which is then populated in the context page item’s Audio URL field.


Interface and Custom Ribbon Button

A custom Ribbon Button on the Home tab streamlines the audio-generating process. This button triggers an interactive Sitecore PowerShell Extensions dialog where authors can configure various options, such as voice selection, field selection, and speech rate adjustment, and kick off the speech synthesis generation.


The customizable options ensure the audio output matches the intended tone and speech rate, providing a tailored listening experience.

Multi-Language Support

Recognizing the diverse needs of global users, the module supports multiple languages. For demonstration purposes and within the natural time constraints of the Hackathon, the following languages are supported in the initial implementation:

  • English (en)
  • Japanese (ja-JP)
  • German (de-DE)
  • Danish (da)

Each supported language selection has a series of Neural (lifelike, natural-sounding) voice options from Microsoft Azure Cognitive Services Speech Service (~449 neural voices to choose from). These hand-selected voices are configured to provide the best audio experience for each language. Of course, support can be expanded to include additional languages (there are 136 languages supported by Azure AI Speech Services).  


High-level Technical Breakdown

Initialization and Setup

The script sets up the necessary Azure services and local environment configurations.


User Interaction and Dialog Configuration

The script provides a dynamic interface through a custom Ribbon Button in the Sitecore Content Editor. This button, titled 'Generate Audio' or 'Regenerate Audio' based on the context item’s state, opens a dialog for configuring the audio output.  The fields and options available in the dialog are as follows:

- Field to Convert to Speech
  • Lists all Rich Text Editor (RTE) and multi-line text fields available on the item.
  • Special Case: If the 'Speech Content Override' field is populated, it appears as an additional option.

- Include Title?
  • A standalone radio button to include the item's title in the audio file.

- Voice
  • Dynamic options based on the item's language: the dialog offers preselected AI Neural voices.

- Speech Rate
  • Controls how fast the speech is spoken.
    • Optional double value, defaulting to 1.0 if left empty.
    • Range: Between 0.5 (slow) and 2.0 (fast).
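
The Speech Rate rule above (empty means 1.0, valid values fall between 0.5 and 2.0) could be handled with a small helper like the sketch below. The function name is hypothetical, and clamping out-of-range input (rather than rejecting it) is an assumption, not necessarily what the module does.

```powershell
# Hypothetical helper: turns the raw dialog input into a usable speech rate.
function Resolve-SpeechRate {
    param([string]$InputValue)

    # An empty field falls back to the default rate of 1.0.
    if ([string]::IsNullOrWhiteSpace($InputValue)) { return 1.0 }

    $rate = 0.0
    $parsed = [double]::TryParse(
        $InputValue,
        [System.Globalization.NumberStyles]::Float,
        [System.Globalization.CultureInfo]::InvariantCulture,
        [ref]$rate)

    # Assumption: unparseable input is treated like an empty field.
    if (-not $parsed) { return 1.0 }

    # Assumption: out-of-range values are clamped to the 0.5 - 2.0 range.
    return [Math]::Min(2.0, [Math]::Max(0.5, $rate))
}
```

For example, `Resolve-SpeechRate '3'` would come back as 2.0 and `Resolve-SpeechRate ''` as 1.0.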

The dialog properties and user input handling are defined as follows:


Fetching and Sanitizing Text Content

The Invoke-AudioStreamFetch function handles the core functionality of fetching the text content from Sitecore, sanitizing it, and preparing it for conversion into speech.

The function checks if the title should be included and concatenates it with the main text content. It then sanitizes the text by removing HTML tags and special characters, ensuring clean input for the TTS service.
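
As a rough illustration of that sanitizing step, here is a minimal sketch. The helper name is hypothetical and the exact rules inside Invoke-AudioStreamFetch may differ; this version simply strips tags, decodes HTML entities, and collapses whitespace.

```powershell
# Hypothetical helper; not the module's actual implementation.
function ConvertTo-PlainSpeechText {
    param(
        [string]$Html,
        [string]$Title
    )

    # Replace tags with a space so neighbouring words don't run together.
    $text = [regex]::Replace($Html, '<[^>]+>', ' ')

    # Decode entities such as &amp; and &nbsp; into literal characters.
    $text = [System.Net.WebUtility]::HtmlDecode($text)

    # Collapse runs of whitespace (including non-breaking spaces) and trim.
    $text = [regex]::Replace($text, '\s+', ' ').Trim()

    # Optionally lead with the item's title.
    if ($Title) { $text = "$Title. $text" }

    return $text
}
```

Calling it with `-Html '<p>Hello&nbsp;<b>world</b> &amp; friends</p>' -Title 'Greeting'` should yield `Greeting. Hello world & friends`.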


Sending Text to Azure AI for Speech Synthesis

As seen above, the sanitized text is then sent to the speech service endpoint for conversion into an audio file. The response, which contains the audio stream, is saved locally.
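
For readers without the original snippet handy, the request could look roughly like the sketch below. It assumes the Speech service's REST text-to-speech endpoint with a subscription key; the function names, the default voice, and the MP3 output format are illustrative choices rather than the module's confirmed settings.

```powershell
# Builds an SSML document; text is XML-escaped so it cannot break the markup.
function New-SsmlDocument {
    param(
        [string]$Text,
        [string]$Voice = 'en-US-JennyNeural',   # assumed default voice
        [string]$Language = 'en-US',
        [double]$Rate = 1.0
    )
    $escaped  = [System.Security.SecurityElement]::Escape($Text)
    $rateText = $Rate.ToString([System.Globalization.CultureInfo]::InvariantCulture)
    "<speak version='1.0' xmlns='http://www.w3.org/2001/10/synthesis' xml:lang='$Language'>" +
    "<voice name='$Voice'><prosody rate='$rateText'>$escaped</prosody></voice></speak>"
}

# Hypothetical wrapper: posts the SSML and writes the returned audio to disk.
function Invoke-SpeechSynthesis {
    param([string]$Ssml, [string]$Region, [string]$SubscriptionKey, [string]$OutFile)

    $headers = @{
        'Ocp-Apim-Subscription-Key' = $SubscriptionKey
        'X-Microsoft-OutputFormat'  = 'audio-16khz-128kbitrate-mono-mp3'
    }
    # Send UTF-8 bytes so non-ASCII text (e.g. Japanese) survives the trip.
    $body = [System.Text.Encoding]::UTF8.GetBytes($Ssml)

    Invoke-RestMethod -Method Post `
        -Uri "https://$($Region).tts.speech.microsoft.com/cognitiveservices/v1" `
        -Headers $headers -ContentType 'application/ssml+xml' `
        -Body $body -OutFile $OutFile
}
```

The speech rate is passed through the SSML prosody element as a multiplier of the default speed, which lines up with the 0.5 to 2.0 range offered in the dialog.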


Uploading the Audio File to Azure Blob Storage

Once the audio file is generated, it is uploaded to Azure Blob Storage by calling the Upload-FileToAzureStorage function. This function handles the Azure Storage REST API authentication and the file upload process.
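
A simplified take on that upload is sketched below. To keep it short, it assumes a SAS token with write permission on the container instead of the shared-key authentication a function like Upload-FileToAzureStorage would need to compute; the function names and parameters here are hypothetical.

```powershell
# Builds the public URL of a blob inside a container.
function Get-BlobUrl {
    param([string]$Account, [string]$Container, [string]$BlobName)
    "https://$($Account).blob.core.windows.net/$Container/$([uri]::EscapeDataString($BlobName))"
}

# Hypothetical wrapper around the Put Blob REST operation (SAS-authenticated).
function Send-AudioToBlob {
    param([string]$FilePath, [string]$Account, [string]$Container, [string]$SasToken)

    $blobName = Split-Path $FilePath -Leaf
    $blobUrl  = Get-BlobUrl -Account $Account -Container $Container -BlobName $blobName

    # Put Blob requires the blob type header; BlockBlob suits a single MP3 file.
    $headers = @{ 'x-ms-blob-type' = 'BlockBlob' }

    Invoke-RestMethod -Method Put `
        -Uri "${blobUrl}?$($SasToken.TrimStart('?'))" `
        -Headers $headers -ContentType 'audio/mpeg' -InFile $FilePath | Out-Null

    # Return the URL (without the SAS token) for the item's Audio URL field.
    return $blobUrl
}
```

Whether the returned URL is publicly playable depends on the container's access level, which is outside the scope of this sketch.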


Updating Sitecore Item with Audio URL

After uploading the audio file to Azure, the script updates the Sitecore item with the URL of the audio file, ensuring that the content authors can easily access and manage the generated audio files.


Utilizing the Audio File on the Front-end

Once an item's Audio URL field has been populated, it can be used on the front-end within an HTML audio tag:




This is the simplest approach for playing the audio file, but further styling customizations are doable.


Video Demo

Part of the Hackathon Entry includes a video demo. You can check it out below:


Final Thoughts

Participating in the Sitecore Hackathon has always been an exhilarating experience for me, given the time crunch and competitiveness of the community. That night, the development of the SPE Text-to-Speech Audio Synthesis Module pushed my organizational and technical boundaries, and I'm proud of what I could accomplish in such a short timeframe. More importantly, I hope the resulting module helps highlight the importance of accessibility in content management and end-user experiences. 

If you're interested in or inspired to build your own Text-to-Speech synthesis module, the full PowerShell script and documentation are available on GitHub.

Wednesday, February 14, 2024

Sitecore ADM: Resolving Stalled Tasks and Restoring Task Processing

My team is currently in the process of purging millions of historical anonymous xDB contact records and associated data using the ADM module for a client whose xDB shard database sizes have been approaching max storage capacity for the Azure tier.  Because xDB is a crucial portion of the client site's operations, our options for reducing the DB size have been somewhat limited due to complex custom external integrations with xDB.

In our approach, we opted to use ADM to purge historical anonymous contacts in batches. We prepare ~300k contact records per shard for each batch, which are manually retrieved via SQL query. Once we've created the temporary table in the shard DB, we prepare the data by generating a comma-delimited list of contacts and then kick off the purge process via ADM. 

When ADM populates its Tasks table, each queued record is subsequently processed by ADM and removed from the Tasks table as it completes processing that record.  The ADM task execution is a generally slow process (1 contact processed every 2-3 seconds); we closely monitor the progress with a SQL query:



With this approach (in addition to SHRINK and REINDEX operations between batches), we have seen the necessary disk size reduction of both xDB shard DBs after running a cadence of several batches.  
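
To put the pace mentioned earlier in perspective, here is a quick back-of-the-envelope estimate. It assumes contacts are processed strictly one at a time at 2-3 seconds each, which is a simplification; the function name is made up for illustration.

```powershell
# Rough duration of one batch, assuming sequential processing.
function Get-PurgeEstimate {
    param([int]$Contacts, [double]$SecondsPerContact)
    [TimeSpan]::FromSeconds($Contacts * $SecondsPerContact)
}

$low  = Get-PurgeEstimate -Contacts 300000 -SecondsPerContact 2
$high = Get-PurgeEstimate -Contacts 300000 -SecondsPerContact 3

# Print the range in days for a ~300k-contact batch.
'{0:N1} to {1:N1} days per batch' -f $low.TotalDays, $high.TotalDays
```

Under those assumptions, a ~300k-contact batch lands somewhere between roughly 6.9 and 10.4 days, which is why a stalled task queue is worth catching early.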

However, we ran into a snag in a recent batch, which brought ADM task processing to a complete halt.  The issue appeared to directly correlate with general Azure Maintenance operations, which had occurred over the weekend while the batch was mid-process.  Azure Maintenance updates typically happen without any advance notice or warning.  Usually, Azure Maintenance operations have minimal adverse effects, but this round seemed to have caused much of the infrastructure to spiral.  When all was said and done, we observed that the ADM tasks were no longer processing.

Attempts to re-start the job via ADM kept resulting in the same error:

"[ADM] Response from xConnect did not indicate success. Status code: BadRequest, Message: {\"Message\":\"The remove task can't be started while another one is running.\"}"

Upon initial analysis, we noted that the ADM tasks table was still populated with IDs that had yet to be processed when the operation was cut off. I began dissecting the ADM binary files for clues - specifically in search of the message "The remove task can't be started while another one is running".  

I learned that the StartContactsDataRemoving method queries an IsRunning method to determine if any other tasks are in progress. If there are, it throws a BadRequest response and returns the "The remove task can't be started while another one is running." message.



Digging deeper led me to this ClearRemoveDataSettings method - called in the StopRunningTasksAndClearStorage method.  Deeper in, there are references to a PropertiesRepository class and an object name of "RemoveDataSettings" used to store task information:


This, in turn, finally led me to a PropertyValueQuery method in a PropertiesRepositoryQueries class, which contained a SQL command used as part of the process:




We reviewed the current state of the ADM Properties table within the ADM DB and found three entries, including RemoveDataSettings:


The RemoveDataSettings record's value appeared to be a JSON representation of ADM's last removal task run.  However, the JSON was cut off after a few hundred characters.  With the value in this state, ADM was convinced that the task had not completed.

Following the approach used in the code (mimicking what should occur when an ADM removal task is completed), we ran the following command:

We also entirely cleared the remaining IDs and Tasks table and re-initialized the process.  With these steps, our ADM tasks were back to processing as expected.

I hope this one helps anyone in a similar situation!


Monday, May 9, 2022

Latest Azure PaaS Sitecore Logs using a single line of PowerShell

If you’re anything like me, you probably don’t have a passion for manually digging through the series of hundreds of randomly dated folders that look like this in search of the latest Sitecore logs:


Hello darkness, my old friend

Although several tools and approaches are available (including this nifty tool credited to fellow Sitecore MVP Kiran Patil - as well as some of my own previous posts from 2018 and 2019 covering this topic), I've recently adopted a different strategy that's proved to be successful across several Sitecore PaaS clients for quickly obtaining the latest physical Sitecore log for a given server. 

The post-worthy kicker?  It's one line of PowerShell:

$kuduHost = "https://yourazuresitename-xp2-cd.scm.azurewebsites.net"; Write-Output "`n[ LATEST SITECORE LOGS ]`n"; $array = @(); Get-ChildItem "C:\home\site\wwwroot\app_data\logs" -File -Recurse | Where-Object { $_.FullName -match "azure.*.txt" -and $_.LastWriteTime -gt (Get-Date).AddHours(-12) } | ForEach-Object { $path = $_.FullName.replace("C:\home\site\wwwroot\app_data\logs\", "$kuduHost/api/vfs/site/wwwroot/App_Data/logs/"); $array += "`n[$($_.LastWriteTime)]`n$path`n"}; $array | Sort-Object $_.LastWriteTime | Select-Object -Last 3

😬

Okay, it's...kind of a long one-liner...but one line nevertheless 

The above example outputs direct links to the latest three physical Sitecore log files that match the pattern 'azure.*.txt'.

In practice, you can highlight the desired file in the console and then copy the URL or open it in a new tab.



Let's break it down

The first line defines a variable for the KUDU host you're using:

$kuduHost = "https://yourazuresitename-xp2-cd.scm.azurewebsites.net"

The second line outputs a (wholly arbitrary and unnecessary) title:

Write-Output "`n[ LATEST SITECORE LOGS ]`n"

The third line represents an array variable aptly named `$array` (because I'm clever):

$array = @();

This is where it gets exciting. This Get-ChildItem cmdlet gets all files recursively under the site's `\App_Data\logs` location:

Get-ChildItem "C:\home\site\wwwroot\app_data\logs" -File -Recurse
Neat!

We can pipe in a Where-Object cmdlet to filter only file names that match 'azure.*.txt' (or, if you want all log types - Publishing, Crawling, Dianoga, SPE, etc. - a pattern such as '\.txt$', since -match expects a regular expression rather than a wildcard) and provide a 12-hour threshold against the `LastWriteTime` property:

Get-ChildItem "C:\home\site\wwwroot\app_data\logs" -File -Recurse |
Where-Object {$_.FullName -match "azure.*.txt" -and $_.LastWriteTime -gt (Get-Date).AddHours(-12)}

We can then pipe in a ForEach-Object cmdlet to iterate through each file:

Get-ChildItem "C:\home\site\wwwroot\app_data\logs" -File -Recurse | Where-Object { $_.FullName -match "azure.*.txt" -and $_.LastWriteTime -gt (Get-Date).AddHours(-12) } | ForEach-Object { $path = $_.FullName.replace("C:\home\site\wwwroot\app_data\logs\", "$kuduHost/api/vfs/site/wwwroot/App_Data/logs/"); $array += "[$($_.LastWriteTime)]`n$path`n"}

Notice that in the ForEach-Object cmdlet, we create a variable called `$path` and set it to a string that takes the file's FullName and replaces the 'system path' portion with our `$kuduHost` variable concatenated to `/api/vfs/site/wwwroot/App_Data/logs/`.

$path = $_.FullName.replace("C:\home\site\wwwroot\app_data\logs\", "$kuduHost/api/vfs/site/wwwroot/App_Data/logs/")

Without this string replacement, we'd only get the system path for the files in the dataset, which would still require navigating to the file manually:

Also, within the ForEach-Object cmdlet, a formatted string containing the LastWriteTime and the `$path` variable is added to the `$array` variable:

$array += "[$($_.LastWriteTime)]`n$path`n"

The `n used above allows for line breaks.

After the files have been processed, the `$array` variable is called and sorted by LastWriteTime.

A Select-Object cmdlet is piped in to limit the number of results to 3:

$array | Sort-Object $_.LastWriteTime | Select-Object -Last 3


By combining these together, eliminating spaces, and adding semi-colons to separate commands, we've got our one-liner! 🕺
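
One caveat worth noting: by the time `$array` is sorted, it holds plain strings, and `$_` is empty outside of a script block, so the final Sort-Object most likely falls back to comparing the formatted text rather than real dates. That tends to work within a 12-hour window, but a sturdier variation is to sort the file objects first and format them last. The sketch below takes that route; the function name and parameters are made up for illustration and it trades the one-liner for readability.

```powershell
# Hypothetical multi-line variant: sort FileInfo objects, then build the links.
function Get-LatestLogLinks {
    param(
        [string]$LogRoot,
        [string]$KuduHost,
        [string]$Pattern = 'azure.*\.txt$',
        [int]$Hours = 12,
        [int]$Count = 3
    )

    $root = (Resolve-Path $LogRoot).Path.TrimEnd('\', '/')

    Get-ChildItem $root -File -Recurse |
        Where-Object { $_.Name -match $Pattern -and $_.LastWriteTime -gt (Get-Date).AddHours(-$Hours) } |
        Sort-Object LastWriteTime |          # real DateTime comparison
        Select-Object -Last $Count |
        ForEach-Object {
            # Path relative to the log root, with forward slashes for the URL.
            $relative = $_.FullName.Substring($root.Length).TrimStart('\', '/') -replace '\\', '/'
            "[$($_.LastWriteTime)]`n$KuduHost/api/vfs/site/wwwroot/App_Data/logs/$relative"
        }
}
```

In the Kudu console this would be called along the lines of `Get-LatestLogLinks -LogRoot 'C:\home\site\wwwroot\App_Data\logs' -KuduHost 'https://yourazuresitename-xp2-cd.scm.azurewebsites.net'`.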

Bonus: IIS HTTP Request Logs

Using the same approach with a few modifications, the application's raw IIS HTTP Request Logs can also be obtained (differences bolded below):

$kuduHost = "https://yourazuresitename-xp2-cm.scm.azurewebsites.net"; Write-Output "`n[ LATEST IIS LOGS ]`n"; $array = @(); Get-ChildItem "C:\home\LogFiles\http\RawLogs" -Recurse | Where-Object { $_.FullName -match ".log" -and $_.LastWriteTime -gt (Get-Date).AddHours(-12) } | Sort-Object $_.LastWriteTime | ForEach-Object { $path = $_.FullName.replace("C:\home\LogFiles\http\RawLogs\", "$kuduHost/api/vfs/LogFiles/http/RawLogs/"); $array += "[$($_.LastWriteTime)]`n$path`n"}; $array | Sort-Object $_.LastWriteTime | Select-Object -Last 3


Final Thoughts

You can generate variations of this one-liner by changing the various variables, then share them with the rest of your development/troubleshooting team so they're ready to copy from an internal Wiki:

Feel free to use and modify the script as you see fit. 🚀

Thursday, April 8, 2021

Sentiment Analysis and Keyword Extraction using Sitecore PowerShell and Microsoft Cognitive Text Analytics

Sitecore Hackathon 2021

Well...wow, it actually happened...

I managed to snag a category win for the 2021 Sitecore Hackathon! 😅


This year, I unexpectedly flew solo as my team members could not attend (both due to completely understandable reasons).  Luckily for me, one of this year's categories, in particular, made me feel like I stood a chance: "Best use of Sitecore PowerShell Extensions to help Content Authors and Marketers."

YES. YES YES 1000x YES. 

Knowing that I needed to land on something fairly quickly to complete all submission requirements (a completed module with clean code, reliable installation instructions, a well-documented README.md, and a video), my evening began with a brainstorming session listing all possible routes I could take for the next 24 hours.

I actually landed on a similar concept I posted about a couple of years back; interacting with Microsoft's Cognitive Services using PowerShell, then focusing on content translation. I knew Microsoft had continued to update their API offerings since that post, so I started digging into what was new.  I stumbled upon the Sentiment Analytics API, which seemed like an excellent use case that could satisfy the 'help Content Authors and marketers' category requirement.  

By providing the right combination of SPE user interactivity (modal dialogs, accessibility of the utility in the Ribbon, etc.), I could build a utility that analyzes content from a given field and provides a sentence-by-sentence breakdown of the content's sentiment score using AI.

After playing around with the example APIs in the browser, I decided to create my Text Analytics Cognitive Service in Azure, grab my API keys, and fiddle around with the API further in Postman.  At that point, I felt pretty confident that I could integrate this with SPE. 🤞

The Sentiment Analyzer would:

  • Analyze the sentiment of field content directly in Sitecore.

  • Give Content Authors the ability to run an analysis of a given field's content, which returns an overall sentiment score and a sentence-by-sentence breakdown of each sentence's sentiment score and corresponding confidence scores.

  • The results are displayed using a Show-Result modal and rendered in an easy-to-digest format.

I built the user dialog, wrote code that generated the appropriate POST data to be passed to the sentiment API endpoint, built the functions to render the data (using emojis, of course 👩‍🚀), configured a new Sitecore template and the corresponding item for API key storage, then tied it all together into an SPE module that exposed the tool from the right-click Context Menu and from the Ribbon.

As midnight approached, I felt that I was in decent enough shape with the Sentiment Analysis script that I could begin exploring another API in the same Text Analytics product group. I moved forward with a second tool utilizing the API's key phrase extraction feature without a tremendous amount of overhead: mostly endpoint changes, JSON parsing, and data rendering differences.

The Keyword Analyzer would:

  • Analyze a field's content to extract critical keywords/phrases.

  • Give Content Authors the ability to analyze a given field's content which returns a list of extracted keywords that can then be used to manually populate a meta keywords field.

  • The results are displayed using a Show-Result modal and rendered in an easy-to-copy format. 


I got started, but a couple hours later...


Then a few hours later...

I spent most of the day (alongside juggling sick-kids priorities) polishing the scripts I had so far; resolving logic issues, error prevention, adding code comments, and overall meticulous code clean-up.

Eventually, I had a functional set of utilities. 

Buttons in the Ribbon configured in the SPE module.


Dialog when clicking either utility against an item with a Single-Line, Multi-Line, or Rich Text field.

Sample output of sentiment analysis

Sample output of keyword analysis

I made sure to stop by for a late morning Coffee Break. ☕


I built the final structure of the SPE module using the Module Wizard 🧙‍♂️ to configure my integration points.  The module also stores the API Settings item, so swapping in an API key would be seamless for anyone who installs the module.


⚡ The module looked like this in the tree:




I spent the final hours of the event packaging the module/testing the installation steps before working on multiple documentation phases (using Markdown for absolutely everything in 2020 was really coming in handy).

It wasn't long before a mid-afternoon Twitter update:



The video production was probably one of the most challenging parts of this experience.  After writing a shorthand verbal script, I tried to record the entire demo in a single recording.  I used OBS Studio to record and the built-in Video Editor in Windows for post-production.  I even squeezed in some personal music snippets I composed some time ago, without risking copyright strikes on YouTube. 😂

The video submission can be viewed here:

By around 5 PM, I was done and had submitted my entry 🚀

The full GitHub submission can be found here, including the full source code for both scripts, the module ZIP for installation, and installation steps.

Take it for a spin if you care to! 🤹‍♂️

I'm really humbled and proud to have been a part of the winner's circle this year.  Another big shout-out to the folks who run and judge the event, as well as a big congratulations to the other category winners!

Check out the complete 2021 Sitecore Hackathon winners announcement here: https://www.youtube.com/watch?v=YEOy7lIDZUU

I'm already looking forward to next year. 📆


Thursday, May 21, 2020

Part II - Integrating Automated Reverse Azure Database Migration PowerShell Script into Azure DevOps


In my last post, we wrote a handy PowerShell script that takes the latest Master and Web SQL Databases from a Production-level Azure Resource Group and imports them into a Staging/UAT/Dev Azure Resource Group for a seamless reverse database promotion process.  

The original script, however, relies on a developer to run the script manually on a local machine and authenticate their credentials in order to utilize the AzureRm commands:

We can take this script a step further and integrate it as a new stage in the existing Azure DevOps Release Pipeline, or as a new dedicated Release Pipeline that can be executed independently.

In this example, we will create a new Azure DevOps Release Pipeline.  We'll assume a Service Principal connection already exists (which is likely if you're deploying to your App Services using Azure DevOps already) and you have the proper administrator permissions to create pipelines in Azure DevOps.  We'll also be working with an Inline Azure PowerShell script job instead of including a script file from an artifact.  Steps will slightly differ if you want to go that route, but the concept would remain the same.

Release Pipeline Setup


Head over to the Pipelines > Release dashboard, click the New dropdown and select New release pipeline.


In the 'Select a template' menu, click 'Empty job'.

Modify the Pipeline name, then click on Stage 1 and click the plus sign on Agent job to add a new agent.  Search for 'powershell', find the Azure PowerShell task, and click the Add button.


Set the Azure Subscription to the appropriate service principal, set the Script Type to Inline Script, and set the Azure PowerShell Version to Latest installed version.


Save the pipeline and navigate to the Variables section.

Variable Setup

Here, we'll add all the variables that we'll consume in the script - allowing for future modification without touching the script code itself.  

In our case, our script calls for the following variables: 
  • sourceResourceGroupName
  • sourceSqlServerName
  • sourceMasterDbName
  • sourceWebDbName

  • targetResourceGroupName
  • targetSqlServerName
  • targetSqlServerAdminUserName
  • targetSqlServerAdminUserPassword
  • targetMasterDbName
  • targetMasterSqlUserPassword
  • targetWebDbName
  • targetWebSqlUserPassword
  • targetCdServerName
  • targetCmServerName


Script Modifications


Luckily, our original script doesn't need too much tinkering! Just a bit 😉

First, we'll want to remove the Login-AzureRmAccount command altogether since the Azure PowerShell task in the pipeline will authenticate off of the service principal.
 
We'll then replace any hardcoded values throughout the script with the corresponding variables we configured earlier, using the $env:someVariableName format:
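
As a small illustration of that pattern, the sketch below reads a pipeline variable from the environment and fails fast when it is missing. The helper name is hypothetical. One thing to keep in mind: Azure DevOps does not automatically expose secret variables (such as the password entries above) as environment variables, so those typically need to be mapped or passed into the task explicitly.

```powershell
# Hypothetical helper: reads a pipeline variable and fails early if it is empty.
function Get-PipelineVariable {
    param([Parameter(Mandatory)][string]$Name)

    $value = [Environment]::GetEnvironmentVariable($Name)
    if ([string]::IsNullOrWhiteSpace($value)) {
        throw "Pipeline variable '$Name' is not set."
    }
    return $value
}

# Example usage inside the inline script (variable names from the list above):
#   $sourceResourceGroupName = Get-PipelineVariable 'sourceResourceGroupName'
#   $targetSqlServerName     = Get-PipelineVariable 'targetSqlServerName'
```

On Windows agents the lookup is case-insensitive, so this behaves the same as `$env:sourceResourceGroupName`; on Linux agents the variable names are upper-cased by the pipeline, so the names passed in would need to match.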

We'll finish this off by placing the modified script in the Inline Script field of our Azure PowerShell task.




Tuesday, April 21, 2020

Automate Reverse Azure Database Migrations using PowerShell



Working with Production-level content in lower environments (e.g. DEV or UAT) is important for ongoing development and testing.  Depending on your item serialization/source control approach, keeping content in sync can be a challenge.

Using Unicorn or TDS for templates and layouts is common, but source-controlling all content (specifically media items) can bring a lot of weight to the project.  In lieu of utilizing serialization technologies or a synchronization tool such as Razl to synchronize content (which I've seen take hours to complete depending on the content load), a common approach is to periodically restore the Master/Web Databases from a Production environment down to lower environments.

In an Azure PaaS setup, without any automation or scripting, this manual process may look like this:

1) Log in to Azure Portal

2) Navigate to the source (production) SQL server instance's Master/Web database

3) Click the copy button and set up the database copy operation configuration (target database name, target server, and pricing tier)

4) Execute the copy operation and wait for the copied database to become available.

5) Log into the target SQL Server instance using SQL Server Management Studio or use the SQL Database Query Editor built into Azure Portal, and execute an ALTER USER query to reset the login password to match the original database passwords

6) Rename the currently connected Master/Web database to include a suffix in the name (e.g. _OLD)

7) Rename the copied Master/Web database to use the original Master/Web database name

8) Restart the server

Obviously, this process can vary, and it is generally tedious and time-consuming.

Luckily, Azure resources can be managed using the suite of PowerShell commands without ever needing to access the UI.  With the right script, the strain of manually executing these steps can be alleviated.

To use these commands, the Azure PowerShell Module must be installed.

For our scenario, let's assume the following:
1) PROD environment is in a different Resource Group from the NON-PROD environment

2) While the Master database should suffice, we'll also copy down the Web database to avoid requiring a publishing operation after the script has completed.

3) ConnectionStrings.config value should not require modification.

4) A short "outage" of the NON-PROD environment will occur during the process since the connected database will be renamed to make room for the copy.

Let's Script It

Step 1 - Define Target and Source Variables

We need to define our target and source variables including source/target Resource Group Names, SQL server names, database names, and NON-PROD environment SQL Admin Credentials.


Step 2 -  Invoke Azure Login Process

This command will invoke the login process to a specific subscription ID. The user will be prompted to log in.


Step 3 -  Rename the currently connected database to make room for the copied database

Since the name of the database on the NON-PROD environment should remain constant, this command will rename the existing NON-PROD database to include a unique dated suffix. Note that this database will not be removed automatically and can be used as a backup in the case that the NON-PROD environment contained content that was not accounted for or backed up prior to the migration.  Removing it will be up to you.
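
A minimal sketch of the rename could look like this, assuming the AzureRM.Sql module is loaded and the session is already authenticated. The function names and the exact suffix format are illustrative, not necessarily what the final script uses.

```powershell
# Builds a unique, dated name for the database being moved out of the way.
function Get-BackupDatabaseName {
    param([string]$DatabaseName, [datetime]$Date = (Get-Date))
    '{0}_OLD_{1:yyyyMMdd_HHmmss}' -f $DatabaseName, $Date
}

# Hypothetical wrapper: renames the currently connected NON-PROD database.
function Rename-TargetDatabase {
    param([string]$ResourceGroupName, [string]$ServerName, [string]$DatabaseName)

    $newName = Get-BackupDatabaseName -DatabaseName $DatabaseName

    Set-AzureRmSqlDatabase -ResourceGroupName $ResourceGroupName `
        -ServerName $ServerName `
        -DatabaseName $DatabaseName `
        -NewName $newName | Out-Null

    # Return the new name so it can be logged or cleaned up later.
    return $newName
}
```

Including the time in the suffix keeps repeated runs on the same day from colliding with an earlier backup copy.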


Step 4 -  Initialize the database copy operation

Once the name of the database is available on the target SQL server, the following command will execute the database copy process.
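
As a sketch, the copy can be wrapped as shown below. It assumes both resource groups live in the same subscription and that the AzureRM.Sql module is available; the wrapper name and parameter names are hypothetical.

```powershell
# Hypothetical wrapper: copies a PROD database onto the NON-PROD SQL server.
function Copy-SourceDatabase {
    param(
        [string]$SourceResourceGroupName,
        [string]$SourceServerName,
        [string]$SourceDatabaseName,
        [string]$TargetResourceGroupName,
        [string]$TargetServerName,
        [string]$TargetDatabaseName
    )

    $copyArgs = @{
        ResourceGroupName     = $SourceResourceGroupName
        ServerName            = $SourceServerName
        DatabaseName          = $SourceDatabaseName
        CopyResourceGroupName = $TargetResourceGroupName
        CopyServerName        = $TargetServerName
        CopyDatabaseName      = $TargetDatabaseName   # the original NON-PROD name
    }

    # Runs synchronously, so the script waits here until the copy is available.
    New-AzureRmSqlDatabaseCopy @copyArgs
}
```

Because the call blocks until the copy finishes, the steps that follow can safely assume the new database exists.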

Step 5 - Execute the ALTER USER query

Since the database user from the source database comes along with a direct copy, an ALTER USER query must be executed against the database to reset the [masteruser] or [webuser] passwords to match what's in the NON-PROD ConnectionStrings.config.
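
One way to approach this step is sketched below. It assumes the Sitecore users are contained database users and that Invoke-Sqlcmd (from the SqlServer module) is available; the function names are hypothetical.

```powershell
# Builds the T-SQL statement, escaping characters that would break the quoting.
function New-AlterUserQuery {
    param([string]$UserName, [string]$Password)

    $user = $UserName.Replace(']', ']]')
    $pass = $Password.Replace("'", "''")
    "ALTER USER [$user] WITH PASSWORD = '$pass';"
}

# Hypothetical wrapper: runs the statement against the freshly copied database.
function Reset-DatabaseUserPassword {
    param(
        [string]$ServerName, [string]$DatabaseName,
        [string]$AdminUser, [string]$AdminPassword,
        [string]$UserName, [string]$NewPassword
    )

    Invoke-Sqlcmd -ServerInstance "$($ServerName).database.windows.net" `
        -Database $DatabaseName `
        -Username $AdminUser -Password $AdminPassword `
        -Query (New-AlterUserQuery -UserName $UserName -Password $NewPassword)
}
```

The same wrapper would be called once for the Master database user and once for the Web database user.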


Step 6 - Restart the App Service

When the copy operation is completed, restarting both App Services will ensure a fresh connection to the databases is established.


Final Script




Thursday, September 5, 2019

Azure Application Insights: Logs & Requests Viewer using Sitecore PowerShell Extensions

Last September, I wrote about accessing Sitecore Logs from Azure PaaS instances using the (now deprecated) AzureAILogs.html file provided by Sitecore. The knowledgebase article was updated in mid-January 2019, and the AzureAILogs.html file was replaced with a new /sitecore/admin page dubbed AzureTools.aspx.

This updated admin page contains all the same functionality found in the AzureAILogs.html, with the addition of being able to pull log traces and requests from Application Insights.



Installation is easy: download the AzureTools.zip file and drop the /sitecore/admin/AzureAILogs.aspx file into your site’s root.

Admittedly, this admin page is great - but I could also see several aspects of SPE being particularly useful (like the OOB SPE ListView - which would easily allow us to filter/sort/search through a series of log entries). An additional option to see raw color-coded logs would also be cool. 😊

Using the existing AzureTools.aspx as a general guide, we can re-create the GUI with general ease.

We’ll need:
1) Option to get Requests or Logs.
2) Option to select a Role (values pulled from API).
3) Option to control recency.
4) Option to control the severity.


The end result will consume the Application Insights REST API endpoints and allow a user to pull logs from Application Insights inside the CMS.


API Access

To start, we'll need to make sure we can work with the API by obtaining an Application Insights App ID and a corresponding App Insights API key.

Sitecore's documentation covers this, but it's as simple as logging into Azure Portal and navigating to your Application Insights service.

Under Configure, select 'API Access':


The Application Insights App ID will be displayed on the following screen:

Copy this value and store it temporarily.

Click the 'Create API Key' button.
Give it a name and check the 'Read telemetry' checkbox.

After clicking 'Generate key', you'll have one opportunity to copy the 'App Insights API key'.  Copy and store this value temporarily.


Initial Communication with the API

Our script will utilize the two values to interact with the API.

Before building our UI in SPE, we'll need to confirm API communication by obtaining the server roles from Application Insights.  We can set a variable to call a function that will grab an ArrayList of roles:

Our function will build the URL, include the proper authorization header containing the API key, and return an array list.
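As a rough illustration of that step, here's a minimal sketch. The helper names (New-AppInsightsRequest, ConvertTo-RoleList) and the "requests | distinct cloud_RoleName" query are my own assumptions for the example, not necessarily what the final script uses. It assumes the Application Insights query endpoint and the x-api-key header. The network call is left commented out so the snippet can be run anywhere.

```powershell
# Hypothetical helper: builds the URL and headers for an Application Insights query call.
function New-AppInsightsRequest {
    param(
        [Parameter(Mandatory = $true)] [string] $AppId,
        [Parameter(Mandatory = $true)] [string] $ApiKey,
        [Parameter(Mandatory = $true)] [string] $Query
    )
    $encodedQuery = [uri]::EscapeDataString($Query)
    @{
        Uri     = "https://api.applicationinsights.io/v1/apps/$AppId/query?query=$encodedQuery"
        Headers = @{ "x-api-key" = $ApiKey }   # the API key travels in this header
    }
}

# Hypothetical helper: flattens the first result table into an ArrayList of role names.
function ConvertTo-RoleList {
    param([Parameter(Mandatory = $true)] [string] $Json)
    $roles = New-Object System.Collections.ArrayList
    $response = $Json | ConvertFrom-Json
    foreach ($row in $response.tables[0].rows) {
        # Each row is an array with a single column (the role name); skip empty values.
        if ($row[0]) { [void]$roles.Add($row[0]) }
    }
    , $roles   # the leading comma keeps the ArrayList from being unrolled
}

$aiAppID = "XXXXXXXXXXXXXXXXXXXXXXXXX"   # placeholder, replace with your App ID
$apiKey  = "XXXXXXXXXXXXXXXXXXXXXXXXX"   # placeholder, replace with your API key
$request = New-AppInsightsRequest -AppId $aiAppID -ApiKey $apiKey -Query "requests | distinct cloud_RoleName"

# Uncomment once real values are in place:
# $response = Invoke-WebRequest @request -UseBasicParsing
# $roles = ConvertTo-RoleList -Json $response.Content
```

Keeping the URL building separate from the web call makes it easy to inspect the request before anything is sent.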



User Interface

Now that we have confirmed communication to the API and obtained our list of roles, we can pass the variable into a new function that will be responsible for building and displaying the UI:

The dialog should contain a series of radio buttons and checkbox lists, all of which will be used to provide options to build out another API call to obtain the traces or requests from AppInsights.
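To give a feel for how those options could be described, here's a minimal sketch that builds the parameter definitions for SPE's Read-Variable. The function name, variable names, labels, and timespan values are all assumptions for illustration. The Read-Variable call itself only works inside SPE, so it's left as a comment.

```powershell
# Hypothetical helper: builds the dialog field definitions for Read-Variable.
function New-LogDialogParameters {
    param([Parameter(Mandatory = $true)] [string[]] $Roles)

    # Roles come from the API, so build that option list dynamically (label = value).
    $roleOptions = [ordered]@{}
    foreach ($role in $Roles) { $roleOptions[$role] = $role }

    $typeOptions     = [ordered]@{ "Logs (traces)" = "traces"; "Requests" = "requests" }
    $recencyOptions  = [ordered]@{ "Last hour" = "PT1H"; "Last 12 hours" = "PT12H"; "Last 24 hours" = "P1D" }
    # Application Insights severity levels: 0 = Verbose through 4 = Critical.
    $severityOptions = [ordered]@{ "Verbose" = 0; "Information" = 1; "Warning" = 2; "Error" = 3; "Critical" = 4 }

    @(
        @{ Name = "logType";          Title = "Type";     Editor = "radio";     Options = $typeOptions;     Value = "traces" },
        @{ Name = "selectedRole";     Title = "Role";     Editor = "radio";     Options = $roleOptions;     Value = $Roles[0] },
        @{ Name = "selectedTimespan"; Title = "Recency";  Editor = "radio";     Options = $recencyOptions;  Value = "PT1H" },
        @{ Name = "selectedSeverity"; Title = "Severity"; Editor = "checklist"; Options = $severityOptions; Value = @(2, 3, 4) }
    )
}

$dialogParameters = New-LogDialogParameters -Roles @("CM", "CD")

# Inside SPE, the definitions would be handed to Read-Variable:
# $result = Read-Variable -Parameters $dialogParameters -Title "Azure Application Insights Logs" `
#     -Description "Choose what to pull from Application Insights." -Width 500 -Height 600 `
#     -OkButtonName "OK" -CancelButtonName "Cancel"
# if ($result -ne "ok") { exit }
```

After the user clicks OK, each Name becomes a variable ($logType, $selectedRole, and so on) holding the chosen value.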

The output displays as follows:

Notice line 51 in the above snippet calls a Get-LogsOrRequests function which accepts a series of parameters from the dialog options upon selecting the OK button.

This function builds out the proper API URL and query parameters based on the selected values passed in. Invoke-WebRequest is used to make the call to the API, which will return a JSON object of log entries from AppInsights based on those parameters.
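A minimal sketch of the URL-building part might look like the following. It assumes the events endpoint (/events/traces and /events/requests) with the timespan, $filter, and $top query parameters. The function name and the OData property paths (cloud/roleName, trace/severityLevel) are my assumptions here, so double-check them against the API reference before relying on them.

```powershell
# Hypothetical helper: builds the events URL from the values chosen in the dialog.
function Get-EventsUri {
    param(
        [Parameter(Mandatory = $true)] [string] $AppId,
        [Parameter(Mandatory = $true)] [ValidateSet("traces", "requests")] [string] $EventType,
        [Parameter(Mandatory = $true)] [string] $Role,
        [string] $Timespan = "PT1H",   # ISO 8601 duration, e.g. PT12H = last 12 hours
        [int[]] $Severity = @(),
        [int] $Top = 100
    )

    # Always filter on the selected role.
    $filters = @("cloud/roleName eq '$Role'")

    # Severity only applies to traces; combine the selected levels with "or".
    if ($EventType -eq "traces" -and $Severity.Count -gt 0) {
        $levels = $Severity | ForEach-Object { "trace/severityLevel eq $_" }
        $filters += "(" + ($levels -join " or ") + ")"
    }

    $filter = $filters -join " and "
    # Single quotes keep $filter and $top as literal parameter names.
    $queryString = 'timespan={0}&$filter={1}&$top={2}' -f [uri]::EscapeDataString($Timespan), [uri]::EscapeDataString($filter), $Top

    "https://api.applicationinsights.io/v1/apps/$AppId/events/$EventType" + "?" + $queryString
}

$uri = Get-EventsUri -AppId "XXXXXXXXXXXXXXXXXXXXXXXXX" -EventType "traces" -Role "CM" -Timespan "PT12H" -Severity 3, 4

# The call itself would then be along these lines:
# $response = Invoke-WebRequest -Uri $uri -Headers @{ "x-api-key" = $apiKey } -UseBasicParsing
```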

Line 119 contains a final call to a function called 'Set-PostDialog' which provides options for displaying the results.

The output here is a ModalDialog with three buttons:


Selecting 'Script View' will display the results of the API in a color-coded Show-Result window:


Selecting 'List View' will process the results into a format suitable for a standard SPE ListView result window (filtering, exporting, etc. are all included here):
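The "processing" here is mostly a matter of flattening each JSON entry into a simple object with a few columns. Here's a minimal sketch, assuming the response exposes its entries under a value array with timestamp, cloud.roleName, trace.message, and trace.severityLevel properties. The function name and column names are hypothetical.

```powershell
# Hypothetical helper: flattens trace entries into objects a ListView can display.
function ConvertTo-LogRow {
    param([Parameter(Mandatory = $true)] [string] $Json)
    $response = $Json | ConvertFrom-Json
    foreach ($entry in $response.value) {
        [pscustomobject]@{
            Timestamp = $entry.timestamp
            Role      = $entry.cloud.roleName
            Severity  = $entry.trace.severityLevel
            Message   = $entry.trace.message
        }
    }
}

# Sample payload shaped like the assumed response, so the snippet runs outside Sitecore.
$sample = '{"value":[{"timestamp":"2019-08-01T10:00:00Z","cloud":{"roleName":"CM"},"trace":{"message":"Something failed","severityLevel":3}}]}'
$rows = @(ConvertTo-LogRow -Json $sample)

# Inside SPE, the rows can be piped straight to the ListView:
# $rows | Show-ListView -Title "Azure Application Insights Logs" -Property Timestamp, Role, Severity, Message
```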


Finally, selecting the 'Download' button will download a .txt file of the contents retrieved from the API.


Final Script





Installation

Manual

  1. Create a new Sitecore item based on the SPE PowerShell Script template and copy the final script above into the Script Body field.
  2. Replace the default "XXXXXXXXXXXXXXXXXXXXXXXXX" placeholder values in the $aiAppID and $apiKey variables with your own.  

Sitecore Package

  1. Download the Sitecore package from GitHub and install it.
  2. Navigate to the PowerShell script located here:
    /sitecore/system/Modules/PowerShell/Script Library/Azure Application Insights Logs/Toolbox/Azure Application Insights Logs
  3. Replace the default "XXXXXXXXXXXXXXXXXXXXXXXXX" placeholder values in the $aiAppID and $apiKey variables with your own.  
The script will be available to run from the PowerShell Toolbox in the Start Menu.


Source Code

The full script can also be found on GitHub.
Feel free to grab a copy and modify it however you see fit.  



Thursday, July 11, 2019

Azure Search 1,000 Field Limit: Generating Values for AddIncludedField using PowerShell

If Azure Search is set up as your search provider, you're probably already aware of the limitations that come with it. One common issue you may discover (even well after your initial launch) is the limit on the number of fields Azure Search allows - a maximum of 1,000 fields.

In our case, after a major feature implementation and the introduction of a new site into our solution, the following log errors surfaced during the index rebuild process:

Exception: Sitecore.ContentSearch.Azure.Http.Exceptions.AzureSearchServiceRESTCallException Message: {"error":{"code":"","message":"The request is invalid. Details: definition : Invalid index: The index contains 1001 leaf fields (fields of a non-complex type). An index can have at most 1000 leaf fields.\r\n"}} 
The rebuild would get stuck and not budge.  This has been documented.

In our case, the <indexAllFields> property was set to true (which is the default value), causing the index field count to exceed the limit.

For the web and master index, there are two ways to work around this limitation within the Sitecore.ContentSearch.Azure.DefaultIndexConfiguration.config or overriding patch file:
  1. Set <indexAllFields> to false, then include the necessary fields in the <include hint="list:AddIncludedField"> section.
  2. Set <indexAllFields> to true, then exclude the unnecessary fields in the <exclude hint="list:AddExcludedField"> section.

I like the first option.

We initially attempted to manually create all fields we believed were needed for our custom search services. This turned out to be rather tedious - and left a lot of room for error had we missed a field.

To quickly and easily identify all custom fields to include in the AddIncludedField section, I put together this super simple Sitecore PowerShell Extensions snippet:

Notice the path to the User Defined folder.  If you run this to include all custom fields with the query above as-is, you may run into the same 1,000 field limit error.  You can simply change the path and run this multiple times (or build a script out of it) to generate sections in bulk for templates that make sense for your specific solution.
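For anyone wanting to build a script out of it, here's a minimal sketch of the general idea. It assumes each entry in the AddIncludedField list is an element whose value is the field's ID, and that the element name only needs to be unique and valid XML. The function name and the name-sanitizing rule are my own for this example. The Sitecore query is left commented so the snippet runs outside SPE with sample data.

```powershell
# Hypothetical helper: turns template field items into AddIncludedField entries.
function ConvertTo-IncludedFieldXml {
    param([Parameter(Mandatory = $true)] [object[]] $Fields)
    foreach ($field in $Fields) {
        # Field names may contain spaces, so sanitize them into valid XML element names.
        $element = $field.Name -replace '[^A-Za-z0-9_]', '_'
        if ($element -match '^[0-9]') { $element = "_$element" }
        "<$element>$($field.ID)</$element>"
    }
}

# Inside SPE, the fields could be gathered like this (adjust the path per run):
# $fields = Get-ChildItem -Path "master:\templates\User Defined" -Recurse |
#     Where-Object { $_.TemplateName -eq "Template field" }

# Sample data (made-up IDs) standing in for real template field items:
$fields = @(
    [pscustomobject]@{ Name = "Page Title"; ID = "{11111111-1111-1111-1111-111111111111}" },
    [pscustomobject]@{ Name = "Summary";    ID = "{22222222-2222-2222-2222-222222222222}" }
)

ConvertTo-IncludedFieldXml -Fields $fields
```

Piping the result through Sort-Object -Unique before pasting helps when the same field name shows up on more than one template.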

The output can be simply copied and pasted into the AddIncludedField section of the Sitecore.ContentSearch.Azure.DefaultIndexConfiguration.config file:



The index should be able to rebuild successfully with all the custom fields present.
If you need to trim the index further, you should be able to determine your exclusions with a bit more granularity knowing that you haven't missed anything.

Happy Indexing! 🚀


Wednesday, June 12, 2019

Azure App Service Deployment Error: There is not enough space on the disk.

I recently came across an error during a deployment to one of our Content Delivery App Service instances - stopping the deployment process dead in its tracks. 

The error occurred when deploying to the CD Slot:


This occurred each time we attempted to deploy this step.

The error in full does indicate troubleshooting codes and links (oddly enough, the Microsoft link did not have any trace of the error code):
Failed to deploy web package to App Service. Error Code: ERROR_NOT_ENOUGH_DISK_SPACE More Information: Web Deploy detected insufficient space on disk. Learn more at: http://go.microsoft.com/fwlink/?LinkId=221672#ERROR_NOT_ENOUGH_DISK_SPACE. Error: The error code was 0x80070070.

I found it especially strange given that we saw no indication in the Azure Portal that we had hit any limit on storage space.  Initially, I thought it might have something to do with storage on the build server, but I was able to quickly rule out this theory by verifying that there was plenty of disk space remaining on that machine.

The error was also evident when I attempted to upload any file directly via FTP:

I began looking for some giant file(s) that might be preventing any additional files from being uploaded (since that's what made the most sense given the context of the error message itself).

By logging into the App Service via FTP, I was able to identify two IIS memory dumps located in the /LogFiles directory.  Both appeared to have been taken the month prior and were simply never removed.

After deleting both memory dump directories, I was able to restart the deployment step - which completed without errors. Direct FTP uploads were also restored.
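If you have a PowerShell prompt on the instance (the Kudu debug console, for example), hunting for the culprits can be scripted rather than clicked through over FTP. Here's a minimal sketch that totals the size of each top-level directory under a path and lists the largest first. The function name is hypothetical, and the D:\home\LogFiles path in the comment is an assumption about the App Service layout.

```powershell
# Hypothetical helper: reports the largest immediate subdirectories of a path.
function Get-DirectorySize {
    param(
        [Parameter(Mandatory = $true)] [string] $Path,
        [int] $Top = 10
    )
    Get-ChildItem -Path $Path -Directory -Force | ForEach-Object {
        $dir = $_
        # Sum the length of every file below this directory (unreadable files are skipped).
        $bytes = (Get-ChildItem -Path $dir.FullName -Recurse -File -Force -ErrorAction SilentlyContinue |
            Measure-Object -Property Length -Sum).Sum
        [pscustomobject]@{
            Directory = $dir.FullName
            SizeMB    = [math]::Round(($bytes / 1MB), 2)
        }
    } | Sort-Object -Property SizeMB -Descending | Select-Object -First $Top
}

# Example (path is an assumption, adjust for your instance):
# Get-DirectorySize -Path "D:\home\LogFiles"
```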

What's up with that?

Well, your Azure App Service is tied to a particular pricing tier that dictates storage, memory, ACU, etc.  In our case, this particular App Service is configured to use the Standard tier - which has a 50GB limit.




By leaving the remnants of these IIS memory dumps on the App Service's storage, we must have surpassed that storage limit.

This brings up some interesting questions:

First, based on the configured pricing tier, is there a limit on how much Sitecore can store in App_Data/MediaCache or /temp folders before hitting that cap?

I assume the answer to that is yes - if your application is not set up to periodically clean stale files (which it should be by default), it's possible to reach that storage limit and cause this error to surface.  In that case, the quick fix to get your deployment out would be to remove some or all temp files in the application.

Second, how can we monitor this storage limit of App Services in Azure Portal?

I'll have to circle back on this one as I don't quite have the answer to it.  It may even already exist, and I just haven't spotted it yet.   I'll update this post if I do figure that one out, but please let me know if you have the answer in the comments!