Monday, May 20, 2024

Sitecore PowerShell Extensions Text-to-Speech Audio Synthesis Module

Another year, another exciting Sitecore Hackathon!  This round, I flew solo under the moniker "Sitecorepunk 2077" (a play on the critically acclaimed 2020 action role-playing video game "Cyberpunk 2077").

If you're curious how the event unfolded, I documented my progress on X (formerly Twitter) every couple of hours:

Needless to say, I was utterly exhausted and slept for 12 hours straight after the 32 hours I had been awake.  While I didn't snag a win (congrats, team Cloud Surfers and team 451 Unavailable For Legal Reasons), I enjoyed the experience, am proud of what I was able to produce, and look forward to the next one.

Module Concept and Inspiration

The 2024 Sitecore Hackathon category I chose to enter was "Best Module for XM/XP or XM Cloud" - although the result could also fit the bill for "Best use of AI".  Inspired by the ever-increasing need for accessible content, I decided to develop a module that converts text content into spoken audio files, which are then stored remotely and saved as an MP3 link within the item's context - all from within Sitecore. Once I landed on the idea, the goal was to provide an easy-to-use tool for generating audio versions of Sitecore content, thereby enhancing accessibility and improving user engagement for individuals with visual impairments or preferences for audio content.


Here’s a breakdown of what makes the SPE Text-to-Speech Audio Synthesis Module stand out:

Lifelike Speech Synthesis from Microsoft Azure Cognitive AI Speech Services

One of the core features of this module is its ability to convert text content into lifelike speech. This makes content more accessible to a broader audience, including those with visual impairments and individuals who prefer consuming content through audio.

The module utilizes Microsoft Azure Cognitive Services Speech Service to generate audio from selected text fields dynamically. This integration ensures high-quality, natural-sounding speech output. Whether it's a blog post, news article, or product description, every piece of content can be converted into audio, broadening its reach and enhancing user engagement.

Storage via Azure Blob Storage

To store the generated audio files, the module leverages Azure Blob Storage APIs. Once an audio file is generated and stored locally in a temporary directory, it is uploaded to a dedicated Azure Storage container. The API returns a URL to the audio file, which is then populated in the context page item’s Audio URL field.

Interface and Custom Ribbon Button

A custom Ribbon Button on the Home tab streamlines the audio-generating process. This button triggers an interactive Sitecore PowerShell Extensions dialog where authors can configure various options, such as voice selection, field selection, and speech rate adjustment, and kick off the speech synthesis generation.

The customizable options ensure the audio output matches the intended tone and speech rate, providing a tailored listening experience.

Multi-Language Support

Recognizing the diverse needs of global users, the module supports multiple languages. For demonstration purposes and within the natural time constraints of the Hackathon, the following languages are supported in the initial implementation:

  • English (en)
  • Japanese (ja-JP)
  • German (de-DE)
  • Danish (da)

Each supported language selection has a series of Neural (lifelike, natural-sounding) voice options from Microsoft Azure Cognitive Services Speech Service (~449 neural voices to choose from). These hand-selected voices are configured to provide the best audio experience for each language. Of course, support can be expanded to include additional languages (there are 136 languages supported by Azure AI Speech Services).  
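
To illustrate how a per-language voice list like this could be organized, here is a minimal Python sketch (the module itself is written in PowerShell). The voice names are real Azure neural voices, but which voices the module actually preselects is not stated here, so the mapping, the `VOICES_BY_LANGUAGE` name, and the `voices_for` helper are all assumptions.

```python
# Hypothetical mapping from Sitecore item language to Azure neural voice names.
# The specific voices are illustrative picks, not the module's actual selection.
VOICES_BY_LANGUAGE = {
    "en": ["en-US-JennyNeural", "en-US-GuyNeural"],
    "ja-JP": ["ja-JP-NanamiNeural", "ja-JP-KeitaNeural"],
    "de-DE": ["de-DE-KatjaNeural", "de-DE-ConradNeural"],
    "da": ["da-DK-ChristelNeural", "da-DK-JeppeNeural"],
}


def voices_for(language):
    """Return the voice options for an item language (empty if unsupported)."""
    return list(VOICES_BY_LANGUAGE.get(language, []))
```

Adding another language under this layout would only mean adding another dictionary entry.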

High-level Technical Breakdown

Initialization and Setup

The script sets up the necessary Azure services and local environment configurations.

User Interaction and Dialog Configuration

The script provides a dynamic interface through a custom Ribbon Button in the Sitecore Content Editor. This button, titled 'Generate Audio' or 'Regenerate Audio' based on the context item’s state, opens a dialog for configuring the audio output.  The fields and options available in the dialog are as follows:

- Field to Convert to Speech
  • Lists all Rich Text Editor (RTE) and multi-line text fields available on the item.
  • Special Case: If the 'Speech Content Override' field is populated, it appears as an additional option.

- Include Title?
  • A standalone radio button to include the item's title in the audio file.

- Voice
  • A dynamic list of preselected AI Neural voices, based on the item's language.

- Speech Rate
  • Controls how fast the speech is spoken.
    • Optional double value, defaulting to 1.0 if left empty.
    • Range: Between 0.5 (slow) and 2.0 (fast).
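
The Speech Rate rules above (optional, default 1.0, range 0.5 to 2.0) can be sketched in Python as follows. This is only an illustration: the function name is hypothetical, and whether the actual script rejects or clamps out-of-range values is an assumption (this sketch rejects them).

```python
def parse_speech_rate(raw, default=1.0, minimum=0.5, maximum=2.0):
    """Parse the optional Speech Rate input from the dialog.

    An empty value falls back to the default; anything outside the
    allowed range is rejected (assumption: the real script may clamp instead).
    """
    if raw is None or str(raw).strip() == "":
        return default
    rate = float(raw)
    if not minimum <= rate <= maximum:
        raise ValueError(
            f"Speech rate must be between {minimum} and {maximum}, got {rate}"
        )
    return rate
```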

The dialog properties and user input handling are defined as follows:

Fetching and Sanitizing Text Content

The Invoke-AudioStreamFetch function handles the core functionality of fetching the text content from Sitecore, sanitizing it, and preparing it for conversion into speech.

The function checks if the title should be included and concatenates it with the main text content. It then sanitizes the text by removing HTML tags and special characters, ensuring clean input for the TTS service.
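
As a rough illustration of that sanitizing step, here is a minimal Python sketch. It is not the module's PowerShell code: the function names are hypothetical, and it only strips tags, decodes HTML entities, and collapses whitespace, which is one plausible reading of "removing HTML tags and special characters".

```python
import html
import re

TAG_PATTERN = re.compile(r"<[^>]+>")
WHITESPACE_PATTERN = re.compile(r"\s+")


def sanitize_text(raw_html):
    """Strip HTML tags, decode entities, and collapse runs of whitespace."""
    without_tags = TAG_PATTERN.sub(" ", raw_html)
    decoded = html.unescape(without_tags)
    return WHITESPACE_PATTERN.sub(" ", decoded).strip()


def build_speech_text(body_html, title="", include_title=False):
    """Optionally prepend the item title to the sanitized body text."""
    body = sanitize_text(body_html)
    if include_title and title.strip():
        return f"{sanitize_text(title)}. {body}"
    return body
```

Replacing each tag with a space before collapsing whitespace keeps words from adjacent elements from running together.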

Sending Text to Azure AI for Speech Synthesis

As seen above, the sanitized text is then sent to the speech service endpoint for conversion into an audio file. The response, which contains the audio stream, is saved locally.
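
For readers unfamiliar with the Speech service REST API, a request of this kind can be sketched in Python as below. The endpoint pattern, the headers, and the MP3 output format are standard for the Azure text-to-speech REST API. The function names, the chosen output format, and the way the rate is passed through SSML `prosody` are assumptions about how the module wires things together.

```python
from urllib.request import Request
from xml.sax.saxutils import escape, quoteattr


def build_ssml(text, voice, rate=1.0, language="en-US"):
    """Wrap sanitized text in SSML with a voice and a relative speech rate."""
    return (
        '<speak version="1.0" xmlns="http://www.w3.org/2001/10/synthesis" '
        f"xml:lang={quoteattr(language)}>"
        f"<voice name={quoteattr(voice)}>"
        f"<prosody rate={quoteattr(str(rate))}>{escape(text)}</prosody>"
        "</voice></speak>"
    )


def build_tts_request(ssml, region, subscription_key):
    """Build (but do not send) the POST request to the regional TTS endpoint."""
    url = f"https://{region}.tts.speech.microsoft.com/cognitiveservices/v1"
    headers = {
        "Ocp-Apim-Subscription-Key": subscription_key,
        "Content-Type": "application/ssml+xml",
        "X-Microsoft-OutputFormat": "audio-16khz-128kbitrate-mono-mp3",
        "User-Agent": "spe-tts-sketch",  # hypothetical client name
    }
    return Request(url, data=ssml.encode("utf-8"), headers=headers, method="POST")
```

Passing the request to `urllib.request.urlopen` and writing the response bytes to a temporary `.mp3` file would then give the local audio file described above.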

Uploading the Audio File to Azure Blob Storage

Once the audio file is generated, it is uploaded to Azure Blob Storage by calling the Upload-FileToAzureStorage function. This function handles the Azure Storage REST API authentication and the file upload process.
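
The upload itself comes down to a single Put Blob call against the Storage REST API. The Python sketch below assumes authentication with a SAS token for simplicity. The authentication scheme that Upload-FileToAzureStorage actually uses is not described here, and the function and parameter names are hypothetical.

```python
from urllib.parse import quote
from urllib.request import Request


def build_blob_upload_request(account, container, blob_name, sas_token, audio_bytes):
    """Build (but do not send) a Put Blob request and return it with the blob URL.

    Assumption: a SAS token with write permission authorizes the call.
    """
    blob_url = (
        f"https://{account}.blob.core.windows.net/{container}/{quote(blob_name)}"
    )
    request = Request(
        f"{blob_url}?{sas_token.lstrip('?')}",
        data=audio_bytes,
        headers={
            "x-ms-blob-type": "BlockBlob",  # required by Put Blob
            "Content-Type": "audio/mpeg",
        },
        method="PUT",
    )
    return blob_url, request
```

The returned `blob_url` (without the SAS query string) is the kind of value that would end up in the item's Audio URL field, assuming the container allows read access to its blobs.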

Updating Sitecore Item with Audio URL

After uploading the audio file to Azure, the script updates the Sitecore item with the URL of the audio file, ensuring that the content authors can easily access and manage the generated audio files.

Utilizing the Audio File on the Front-end

Once an item's Audio URL field has been populated, it can be used on the front-end within an HTML audio tag:
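
As a minimal sketch, assuming the rendering layer has the field value available as a string, a small Python helper could produce that markup as follows. The helper name and fallback text are hypothetical. A Sitecore rendering would normally emit the same tag from its own view or component code.

```python
from html import escape


def render_audio_tag(audio_url):
    """Return an HTML audio element pointing at the generated MP3 file."""
    return (
        "<audio controls>"
        f'<source src="{escape(audio_url, quote=True)}" type="audio/mpeg">'
        "Your browser does not support the audio element."
        "</audio>"
    )
```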

This is the simplest approach for playing the audio file, but further styling customizations are doable.

Video Demo

Part of the Hackathon Entry includes a video demo. You can check it out below:

Final Thoughts

Participating in the Sitecore Hackathon has always been an exhilarating experience for me, given the time crunch and competitiveness of the community. That night, the development of the SPE Text-to-Speech Audio Synthesis Module pushed my organizational and technical boundaries, and I'm proud of what I could accomplish in such a short timeframe. More importantly, I hope the resulting module helps highlight the importance of accessibility in content management and end-user experiences. 

If you're interested in or inspired to build your own Text-to-Speech synthesis module, the full PowerShell script and documentation are available on GitHub.

Tuesday, April 9, 2024

Sitecore XM Cloud Developer Certification Practice Exams: A Free Study Companion

Certification is a crucial milestone for any developer pursuing excellence and proficiency in Sitecore XM Cloud.  One of my preferred ways to learn and study is via practice exams.  However, with the existing Sitecore XM Cloud practice exams available online costing between $30 and $150, the financial burden of preparing can be as daunting as the exam itself.

That's why I'm excited to introduce the Sitecore XM Cloud Developer Certification Practice Exams app, a completely free resource designed to democratize the preparation process for all Sitecore developers.

Elevating Your Exam Readiness Without the Cost

The XM Cloud Certification demands a deep understanding of numerous Sitecore aspects, from XM Cloud architecture and developer workflow to security and data modeling. This exhaustive list requires serious preparation. The Sitecore XM Cloud Developer Certification Practice Exams app offers a thorough, cost-free study tool that reflects the actual exam's breadth and depth.

Tailored for Comprehensive Preparation

  • Precise Exam Simulation: The practice exams simulate the actual test with 50 questions chosen randomly, testing not just your knowledge but also your ability to perform under exam conditions.

  • Competency-Centric Learning: Dive into crucial competencies on which the exam will test you. Each practice question is sourced from Sitecore's documentation and is an opportunity to fortify your understanding of core Sitecore XM Cloud competencies.

  • Real Exam Experience: Sharpen your time management skills with a 100-minute timer that mirrors the exam's duration.

Commitment to Community and Accessibility

Access to educational resources should be barrier-free in a landscape dotted with expensive prep materials.  The Sitecore XM Cloud Developer Certification Practice Exams app was born from a blend of personal needs and a commitment to the Sitecore community. This practice exam tool is my contribution towards leveling the playing field for all aspiring XM Cloud certified developers.

I'm excited to offer this resource to the community, ensuring that everyone has the chance to study effectively and become certified without the financial strain. Start your free practice runs today and please share this tool with anyone who might benefit.

Happy learning!

Wednesday, February 14, 2024

Sitecore ADM: Resolving Stalled Tasks and Restoring Task Processing

My team is currently in the process of purging millions of historical anonymous xDB contact records and associated data using the ADM module for a client whose xDB shard database sizes have been approaching max storage capacity for the Azure tier.  Because xDB is a crucial portion of the client site's operations, our options for reducing the DB size have been somewhat limited due to complex custom external integrations with xDB.

In our approach, we opted to use ADM to purge historical anonymous contacts in batches. We prepare ~300k contact records per shard for each batch, which are manually retrieved via SQL query. Once we've created the temporary table in the shard DB, we prepare the data by generating a comma-delimited list of contacts and then kick off the purge process via ADM. 

When ADM populates its Tasks table, each queued record is subsequently processed by ADM and removed from the Tasks table as it completes processing that record.  The ADM task execution is a generally slow process (1 contact processed every 2-3 seconds); we closely monitor the progress with a SQL query:

With this approach (in addition to SHRINK and REINDEX operations between batches), we have seen the necessary disk size reduction of both xDB shard DBs after running a cadence of several batches.  
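
For a sense of scale, the figures quoted above translate into a rough batch duration. The Python sketch below assumes the ~300k contacts of one shard are processed strictly one after another at the quoted rate of 2-3 seconds each. Actual throughput may differ, and the function name is hypothetical.

```python
def estimate_batch_hours(contacts, seconds_per_contact):
    """Estimate how many hours a batch takes at a fixed per-contact rate."""
    return contacts * seconds_per_contact / 3600


# ~300k contacts per shard at 2 and 3 seconds per contact
best_case_hours = estimate_batch_hours(300_000, 2)   # about 167 hours (~7 days)
worst_case_hours = estimate_batch_hours(300_000, 3)  # 250 hours (~10.4 days)
```

Under those assumptions a single batch runs for a week or more, which makes it likely that a batch will still be in flight when an unplanned infrastructure event occurs.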

However, we ran into a snag in a recent batch, which caused ADM task processing to halt entirely.  The issue appeared to correlate directly with general Azure Maintenance operations, which had occurred over the weekend while the batch was mid-process.  Azure Maintenance updates typically happen without any advance notice or warning.  Usually, Azure Maintenance operations have minimal adverse effects, but this round seemed to have caused much of the infrastructure to spiral.  When all was said and done, we observed that the ADM tasks were no longer processing.

Attempts to re-start the job via ADM kept resulting in the same error:

"[ADM] Response from xConnect did not indicate success. Status code: BadRequest, Message: {\"Message\":\"The remove task can't be started while another one is running.\"}"

Upon initial analysis, we noted that the ADM tasks table was still populated with IDs that had yet to be processed when the operation was cut off. I began dissecting the ADM binary files for clues - specifically in search of the message "The remove task can't be started while another one is running".  

I learned that the StartContactsDataRemoving method queries an IsRunning method to determine if any other tasks are in progress. If there are, it throws a BadRequest response and returns the "The remove task can't be started while another one is running." message.

Digging deeper led me to this ClearRemoveDataSettings method - called in the StopRunningTasksAndClearStorage method.  Deeper in, there are references to a PropertiesRepository class and an object name of "RemoveDataSettings" used to store task information:

This, in turn, finally led me to a PropertyValueQuery method in a PropertiesRepositoryQueries class, which contained a SQL command used as part of the process:

We reviewed the current state of the ADM Properties table within the ADM DB and found three entries, including RemoveDataSettings:

The RemoveDataSettings record's value appeared to be a JSON representation of ADM's last removal task run.  However, the JSON was cut off after a few hundred characters.  With the value in this state, ADM was convinced that the task had not completed.
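
To make the failure mode concrete, here is a toy Python model of the behavior as I understood it. It is not ADM's code: the class is hypothetical, a plain dictionary stands in for the ADM Properties table, and treating "a RemoveDataSettings value exists" as "a task is running" is an inference from the observations above, not a confirmed implementation detail.

```python
class TaskAlreadyRunning(Exception):
    """Raised when a removal task is started while another appears active."""


class RemovalTaskModel:
    """Toy model of the guard around starting an ADM removal task."""

    def __init__(self, properties):
        # Stand-in for the ADM Properties table (name -> value).
        self.properties = properties

    def is_running(self):
        # Assumption: any leftover settings value counts as a running task.
        return bool(self.properties.get("RemoveDataSettings"))

    def start(self, settings_json):
        if self.is_running():
            raise TaskAlreadyRunning(
                "The remove task can't be started while another one is running."
            )
        self.properties["RemoveDataSettings"] = settings_json

    def clear_remove_data_settings(self):
        # Mirrors what should happen when a removal task completes.
        self.properties.pop("RemoveDataSettings", None)
```

In this model, a truncated leftover value blocks every new `start` call until `clear_remove_data_settings` runs, which matches the symptom we saw after the interrupted batch.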

Following the approach used in the code (mimicking what should occur when an ADM removal task is completed), we ran the following command:

We also cleared the remaining IDs and the Tasks table entirely and re-initialized the process.  With these steps, our ADM tasks were back to processing as expected.

I hope this one helps anyone in a similar situation!