A tiny, cross-browser script to intercept third-party JavaScript injection via document.write()

Many of us have encountered third-party scripts which use document.write(), particularly for script injection. Certain older analytics scripts use this approach ({cough} {cough} {Omniture!}). If you’re stuck maintaining an older site with a projected end-of-life, you may not have the time to upgrade those old bad scripts away, especially if they’re hosted externally and dynamically generated.

Unfortunately, many of those same scripts were written without async loading in mind. By blocking rendering and DOM or CSSOM construction, they can greatly decrease the performance of your site; a second-plus of unnecessary page-load time is not uncommon. The problem is so pervasive that Google finally put its metaphorical foot down, ever so lightly, by decreasing support in Chrome for third-party document.write() statements; as an immediate effect of this change, certain scripts may not even be loaded on mobile, defeating the purpose of loading the scripts in the first place.

This can leave us in a pickle: how to intercept document.write() calls by these third-party scripts without disabling analytics or other functionality for which they were included in the first place, ideally while decreasing page-load times through the use of async/defer loading?

This can be done by replacing document.write with a function that intercepts attempts to inject scripts in this way, and instead adds them dynamically to the DOM, while passing through all other non-script parameters to the original document.write() function:



// 1. Keeps a reference to the original document.write() method, for safekeeping
document.writeText = document.write;

// 2. Assigns a new function to document.write(), to serve as a middleman
document.write = function () {
    // document.write() accepts several arguments and concatenates them
    var markup = Array.prototype.join.call(arguments, '');
    if (!markup) return;

    // Matches a script tag with a quoted src attribute, in either quote style and any case
    var scriptPattern = /<script\b[^>]*?\bsrc\s*=\s*(['"])(.*?)\1/i;
    var match = scriptPattern.exec(markup);
    if (match) {
        // Add the script to the DOM, using the value of the src attribute
        var script = document.createElement('script');
        script.src = match[2];
        document.head.appendChild(script);
    }
    else {
        // Pass all other markup through to the original function
        document.writeText(markup);
    }
};

The last piece of the puzzle, if you’re also interested in decreasing page-load time with these older libraries, is to load them asynchronously. Options here would include the following:

1. Use of libraries such as postscribe, jQuery, etc., to help with async-loading external scripts in general.

2. Using the async and/or defer attributes directly on the script tag including a legacy library (I’m unaware as of this writing of a way to do this cleanly with cross-browser support using the DOM and avoiding use of innerHTML). This may be recommended against, as discussed further below after #3.

3. Especially in the case of a legacy script which is relatively lightweight itself but uses document.write() extensively to inject script tags, you could allow the first script to load synchronously, but use the technique above to write the script tags yourself, adding async and defer as desired. This may significantly decrease page-load time, because the bulk of render-blocking script loads are avoided. To use this technique, modify the code sample to use document.writeText() (or whatever you’ve called your placeholder for the old document.write()) to emit an HTML script tag, using the regex-parsed src value and adding “async defer” directly.
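
As a rough sketch of option 3, the middleman can re-emit each intercepted script tag with async and defer added, rather than appending a DOM element. The helper names below (extractScriptSrc, buildAsyncScriptTag, installWriteInterceptor) are made up for illustration; only document.write and the writeText placeholder come from the sample above. The interceptor takes a document-like object so it can be tried outside a browser; on a real page you would pass in document.

```javascript
// Matches a script tag with a quoted src attribute and captures the URL.
var SCRIPT_SRC_PATTERN = /<script\b[^>]*?\bsrc\s*=\s*(['"])(.*?)\1/i;

// Returns the src URL found in the markup, or null if there is none.
function extractScriptSrc(markup) {
    var match = SCRIPT_SRC_PATTERN.exec(String(markup));
    return match ? match[2] : null;
}

// Builds a replacement script tag with async and defer added.
function buildAsyncScriptTag(src) {
    return '<script async defer src="' + src.replace(/"/g, '&quot;') + '"><\/script>';
}

// Installs the interceptor on a document-like object (hypothetical helper).
function installWriteInterceptor(doc) {
    // Keep the original write() reachable under the placeholder name
    doc.writeText = doc.write;
    doc.write = function () {
        var markup = Array.prototype.join.call(arguments, '');
        if (!markup) return;
        var src = extractScriptSrc(markup);
        // Script tags are rewritten; everything else passes through untouched
        doc.writeText(src ? buildAsyncScriptTag(src) : markup);
    };
}
```

In a page, this would be called once as installWriteInterceptor(document), before the legacy script loads. As with the original sample, any markup written in the same call as a script tag is dropped, so this assumes the legacy script writes each tag on its own.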

Adobe in particular has consistently recommended against loading DTM libraries async, and this legacy can be seen stretching back to the days of the aforementioned Omniture, which unfortunately does use document.write() extensively to load other scripts from dynamically generated code. Thus a working approach to get at least some such scripts loaded asynchronously, while avoiding page flicker, can be to leave the script-loading legacy script in place as-is, but intercept its calls to document.write().

Global.asax, Keeping the Magic Alive

In my efforts to retrofit an old Sitecore Web Forms application for caching which is safe for use with postback, etc. in an elegant way, I needed to review the full set of “magic” methods available in the Global.asax application file, which ASP.NET wires up at runtime.

As a reminder, make sure that you’ve included a script runat=”server” tag enclosing your code–you may have to restore this if deleted from, or never added to, an empty file. Confusion abounds on the web as to whether Global.asax works with ASP.NET MVC (it does), primarily because of this missing script tag.

The application- and session-specific event methods are:

Application_Start
Application_End
Application_Error
Session_Start
Session_End

The request-specific events are:

Application_BeginRequest
Application_AuthenticateRequest
Application_AuthorizeRequest
Application_ResolveRequestCache
Application_AcquireRequestState
Application_PreRequestHandlerExecute
Application_PreSendRequestHeaders
Application_PreSendRequestContent
Application_PostRequestHandlerExecute
Application_ReleaseRequestState
Application_UpdateRequestCache
Application_EndRequest

Sources:
https://web.archive.org/web/20071223170129/http://articles.techrepublic.com.com/5100-10878_11-5771721.html
http://sandblogaspnet.blogspot.com/2008/03/methods-in-globalasax.html

An Aspect-Oriented Programming (AOP) Approach to Logging

New: Dynamically evaluate C# expressions and execute C# scripts with a single statement, from anywhere in a .NET application. Click here for more info.

Logging is a topic near and dear to my heart, having in an earlier version of .NET created a logging package tuned for high performance and used many others since. Today, with multiple popular offerings available to the .NET developer with different strengths and weaknesses (NLog, Log4Net, Serilog et al.) it’s not unusual to see adapters in local codebases to allow configuration and use of different packages as desired. This is actually a practice I recommend, to decouple local APIs from the implementation of a solution to a common and cross-cutting concern.

This naturally leads to thoughts of simplifying access in one’s API to the logging code. Aspect-oriented logging has in the past included attribute-based approaches, such as in PostSharp. But what if one hasn’t adopted such a library, or would like to log statements from code inside existing methods?

Suppose that one has logging classes presumably configured using some IoC implementation, and wants to decorate an API with logging functionality without undue clutter. One can use C# extension methods and weak references together with a marker interface to achieve the desired effect. Here are the steps:

1. Create an interface with which to decorate classes that will generate log information. (In the linked code sample, see the ILogSource interface.)

2. Add extensions to the log-source interface to support logging messages and/or events, corresponding with the desired use of the logging API, and to get and set a logger using a weak reference. (In the linked code sample, see the ILogSourceExtensions static class, stored with ILogSource in ILogSource.cs).

3. Decorate any desired class with logging functionality by implementing the marker interface, and configuring it with a logger as desired, then calling its logging methods within its other code. (See the code sample for more.)

This approach still allows an adapter to a target logging API to be used, and run-time configuration of the implementation as desired. It merely provides syntactic sugar to avoid littering your API with logging substructure in base classes and the like, by reducing the necessary plumbing to a single interface marked on the logging client class. The overhead of looking up the weak-referenced logger turns out to be minimal, at several nanoseconds per call. I’m still thinking through how best to wire this together with injection; constructor injection seems to obviously be out of the question.

Cure YSOD in the Sitecore Template Inheritance Tab

I recently encountered a Yellow Screen O’ Death (YSOD) error when attempting to use the template inheritance tab while viewing a template in the Sitecore Content Editor. As it turned out, the culprit was a template field with a blank type. To find such fields, run the following query, then set their types:

/sitecore/templates/User Defined//*[@@TemplateKey = 'template field' and @Type='']

How to use the File Explorer and XPath Builder in Sitecore 8.1

Later versions of Sitecore don’t provide an easy way to launch the Sitecore File Explorer and XPath Builder from the web, but they’re still there. In order to use them, go directly to the following URLs on your instance, and enjoy.

/sitecore/shell/default.aspx?xmlcontrol=FileExplorer
/sitecore/shell/default.aspx?xmlcontrol=IDE

Extension methods + weak references = extension pseudo-properties in C#

When the below was first written, extension properties were just wishful thinking. However, check out the upcoming “Extension Everything” features slated for C# 8. Unfortunately, additional state is apparently not yet to be supported directly, but at least the property syntax will be cleaner; state can be dummied up as necessary using the below approach.

Note: A full reference implementation including extension methods for both setting/getting extra item data, as well as caching, is available at the SharpByte project. See ObjectExtensions.cs.

Extension methods can be helpful for adding functionality onto existing classes, especially where one doesn’t have the ability to control a class definition and thus can’t add the methods directly. Adding extension properties to C# could be just as useful in some programming scenarios, but so far isn’t slated for definite release. Yet when working with third-party code such as the Sitecore API, it can be especially useful to add on both post-hoc functionality and state. And while extension methods can’t precisely duplicate the syntax of properties in C#, they can come close through the use of getter/setter methods as in Java, if some facility is used to store data on the object.

The System.Runtime.CompilerServices.ConditionalWeakTable class is ideal for such use, as a collection specifically made to contain weak references, i.e. to allow any referred-to object to be garbage collected (releasing the weak reference as well) if all strong references have been removed. What this means on a practical basis is that one can provide ancillary state for an object by using such a weak-reference collection in a static, and ideally thread-safe, way. It’s exactly what’s needed to dummy up extension properties in C#, while we wait for the real deal from Microsoft.

Here’s a basic example:

using System;
using System.Collections.Generic;
using System.Runtime.CompilerServices;

public static class ExtendedDataExtensions
{
    ///<summary>Stores extended data for objects</summary>
    private static ConditionalWeakTable<object, object> extendedData = new ConditionalWeakTable<object, object>();

    /// <summary>
    /// Creates a collection of extended pseudo-property values
    /// </summary>
    /// <param name="o">The object to receive the tacked-on data values</param>
    /// <returns>A new dictionary</returns>
    internal static IDictionary<string, object> CreateObjectExtendedDataCache(object o)
    {
        return new Dictionary<string, object>();
    }

    /// <summary>
    /// Sets an extended pseudo-property value on this object
    /// </summary>
    /// <param name="o">this object</param>
    /// <param name="name">The pseudo-property name</param>
    /// <param name="value">The value to set (if null, any value for the name will be removed)</param>
    public static void SetExtendedDataValue(this object o, string name, object value)
    {
        if (string.IsNullOrWhiteSpace(name)) throw new ArgumentException("Invalid name");
        name = name.Trim();

        // Gets the key-value collection serving as extended "properties" for this object
        IDictionary<string, object> values = (IDictionary<string, object>)extendedData.GetValue(o, CreateObjectExtendedDataCache);

        if (value != null)
            values[name] = value;
        else
            values.Remove(name);
    }

    /// <summary>
    /// Gets a pseudo-property value stored for this object
    /// </summary>
    /// <typeparam name="T">The type to return</typeparam>
    /// <param name="o">this object</param>
    /// <param name="name">The pseudo-property name</param>
    /// <returns>A value of the indicated type, or the type default if not found or of a different type</returns>
    public static T GetExtendedDataValue<T>(this object o, string name)
    {
        if (string.IsNullOrWhiteSpace(name)) throw new ArgumentException("Invalid name");
        name = name.Trim();

        IDictionary<string, object> values = (IDictionary<string, object>)extendedData.GetValue(o, CreateObjectExtendedDataCache);
        object value = null;
        if (values.TryGetValue(name, out value)) 
        {
            if (value is T)
                return (T)value;
            else
                return default(T); // or throw an exception, as desired
        }
        else
            return default(T);
    }

    /// <summary>
    /// Gets a pseudo-property value stored for this object
    /// </summary>
    /// <param name="o">this object</param>
    /// <param name="name">The pseudo-property name</param>
    /// <returns>A value if found for the specified name, otherwise null</returns>
    public static object GetExtendedDataValue(this object o, string name)
    {
        if (string.IsNullOrWhiteSpace(name)) throw new ArgumentException("Invalid name");
        name = name.Trim();

        IDictionary<string, object> values = (IDictionary<string, object>)extendedData.GetValue(o, CreateObjectExtendedDataCache);
        object value = null;
        if (values.TryGetValue(name, out value))
            return value;
        else
            return null;
    }
}

A brief example of calling the methods:

Item item = Sitecore.Context.Item;
item.SetExtendedDataValue("Tweet Count", 300);
// ...
int tweetCount = item.GetExtendedDataValue<int>("Tweet Count");

… after which one’s thoughts naturally turn to wrapping things up a bit more nicely, seasoning to taste for such issues as null and range checking. An inane example, for want of a better one:

public static class ItemExtensions
{
    public static int GetTweetCount(this Item item)
    {
        return item.GetExtendedDataValue<int>("Tweet Count");
    }

    public static void SetTweetCount(this Item item, int tweetCount)
    {
        item.SetExtendedDataValue("Tweet Count", tweetCount);
    }
}

And, of course, if it’s desired to make the pseudo-properties thread-safe, merely replace the dictionary storage with ConcurrentDictionary, with nearly identical performance.

ORMs for Sitecore: Glass Mapper and, well, the rest of ’em

Anyone who’s been in modern business computing for any real length of time has likely been presented, and in multiple contexts, with the challenge of storing and retrieving object data to/from an underlying data store. Object-relational mapping tools can make this easier and lead to cleaner, more maintainable code–but the devil’s in the details of the ORM implementation, as briefly explored below.

One might be tempted to think that any ORM is a good ORM, at least as offering general advantages over naked item access, but in practice this is not always true. I’ve found that direct API access to items is far more convenient, reliable and maintainable than a badly implemented ORM. Type safety is not the only measure of maintainability of object-oriented code which accesses a data store. Efficiency of coding, clarity of the object structure, and ease of maintenance (which cannot be measured by speed of generating a code template in the first place, but must include an assessment of how long updates may take) can be more important on a practical basis.

In general, use of a well-maintained de facto standard such as the Sitecore Glass Mapper will offer many advantages, from the design of the ORM code itself, to less cluttering of the domain model, to decreased ramp-up time for developers new to the team. I’ve learned this the hard way: of all home-grown toolsets, ORMs can lead to the most disadvantages long-term, perhaps rivalled only by badly written local CMSes. Eventually, one simply has to cut the cord.

Item-wrapping, homegrown ORMs can be tantalizingly easy to write, but lead to drawbacks

At one company a sprawling codebase, put together by different developers over time and using different styles and approaches, included the use of a code generation tool designed to be used as an ORM. The way this tool worked, one would invoke the tool, template data would be loaded, and a basic interface and class file would be generated. One could copy the code into one’s working directory and include it in a project. Fields were wrapped for type safety. Easy peasy, right?

Not exactly. Code updates with an item-wrapper-generating ORM, whenever a corresponding structure changes in the CMS, can be a major hassle. If any changes have been made to the object code since initial auto-generation, updates necessarily involve merging (often manually) a newly-generated code file with the older, customized file, so that crucial information isn’t lost.

In my experience, a typically clunky implementation also tends to obscure the natural class and interface model of the code, introducing clutter and bad design for no reason. So, for example, a typical implementation may have a particular code file, based on a Sitecore template which extends another template, only include that item’s own fields. “But wait,” one may think, “isn’t that okay, as long as any inherited templates can be generated as well, and the subclasses linked to parent classes/superclasses?” That is certainly true, except that, regrettably, homegrown and other badly done implementations often don’t do this easily or at all. That leaves the developer to create those linkages, and to reapply them after every change and code regeneration, which commonly leads to time-wasting maintenance errors.

In extremely bad cases, where a generated wrapper class contains field ID-name mappings as well, those are not cleanly reflected in the inheritance model; I’ve seen auto-generated field names hiding ones from inherited classes/templates, i.e. inherited field names not being available on subclasses, obscuring the programming model. In my experience, a naively made ORM may be accompanied by code accessing item fields directly using these names, either through sheer bad programming practice by subsequent developers befuddled by the arrangement, or because of the need to work around limitations of the ORM. And, to cap it all off, direct item and field access may only be available with clunky-looking calls to methods or properties, if at all; and the ORM itself may begin to take on the character of a badly constructed API, where its base classes have loads of badly designed “helper” code.

A main problem with a badly written ORM is that it typically obscures the domain object model, which out of the entire codebase can be the most important to keep clean. By far the best maintainability of ORM-using code results from a tool which can at least keep any mappings outside of the domain model proper, and which ideally leads to a clear, declarative programming style. (In Sitecore, essentially an ASP.NET website providing a C# API, one’s thoughts naturally turn to the use of attributes; this is a clear advantage for the Glass Mapper, discussed below.)

Unfortunately, the more an old, home-grown ORM is used, the more it may become part and parcel of an entire local codebase, taking major effort to extract. Even worse, developers who created the ORM may become emotionally invested in its continued use, resisting change. In all cases it is, in my experience and opinion, best to nevertheless adopt a more modern approach. Otherwise, the technical debt simply continues to increase over time, spread over many deliverables where it is not easy to track.

The Custom Item Generator: somewhat better, but only just

The Sitecore Custom Item Generator is a free module available on the Sitecore Marketplace, which essentially takes the item-wrapping approach described above, but is of higher quality than typical homegrown item-wrapping ORMs in my experience. When I first began working with Sitecore, it was on a codebase making extensive use of the Custom Item Generator, and at least it did not involve a steep learning curve.

The CIG was also implemented at a time when the Glass Mapper was not available, and in an honest attempt to make life easier for developers, so it’s best viewed charitably. Still, it offers many drawbacks which need not be explored to death here, but again notably include the need to manually regenerate classes in case of Sitecore data model changes, and then manually merge the updated auto-generated code with any customizations in the codebase. I’ve actually seen code making direct item access which was cleaner and more maintainable than CIG-using code; especially with a well-written set of extension methods, item access can actually be fairly simple and straightforward, and essentially self-documenting.

There are other code generation tools which work with Sitecore items, such as Hedgehog TDS Code Generation. They essentially work by accessing a template via the Sitecore API, then generating a corresponding C# code file, and as such may be seen as helpful in initially generating interfaces and classes. However, these tools should not be confused with the advantages of an ORM, and can still offer drawbacks such as necessitating merges with updates. Writing the C# code for a cleanly designed class and interface model is typically the least of a developer’s worries, and usually involves a negligible amount of time.

Enter Glass, the greatest thing since sliced bread

The Sitecore Glass Mapper was written with all of these past shortcomings in mind, and designed to eliminate them. Sliced bread’s pretty great, but can it magically pull up item data from Sitecore (or Umbraco) and map field data to type-safe properties, and support looking up related item data with a declarative syntax, while preserving your clean object model and isolating it from rework in case of changes to the data model? No sandwich can do such a thing, no matter how cleverly made.

The benefits of Glass are fairly well-documented at the project’s site, and thus I won’t exhaustively get into them here (and though the documentation used to be a bit outdated in parts, it’s undergone an upgrade).

On one large project I recommended adopting Glass for new development, where previous Sitecore work had used the Custom Item Generator. This particular project featured a fairly detailed set of classes and corresponding set of templates and branch templates, to simplify creating a new instance of a particular structure, which would happen on a frequent basis. Elements within the subtree needed to be able to refer to each other at their relative locations, as well as data in a shared global area.

On that project and others, the ability to use Sitecore Query expressions in Glass attributes simply worked wonders. It became a snap to wire the entire object structure together and pull the data up correctly, without the need for custom development; the work of generating the object structures in memory became as simple as declaratively including a few paths, which again made the project nearly self-documenting. Providing reusable functionality with the use of a base Glass class and extension methods was a snap. Especially in this project’s read-heavy typical use pattern (which, let’s face it, underlies the entire architecture of a web CMS like Sitecore), the Glass Mapper’s caching support worked well too.  Work has now also been done toward letting users optionally pull Glass data from a Lucene index.

On that first large project using Glass, I saw first-hand the massive improvements in ease of maintenance and overall code simplicity that an ORM like Glass offers when compared with item-wrapping approaches. Since then I’ve run across occasional situations where Glass loading code would fail to locate a class across boundaries, such as within a WCF service, but each time the problem was straightforward to understand and solve, so I consider Glass to have relatively few drawbacks. Anyone implementing Sitecore for the first time would do well to investigate the Glass Mapper.

Mixing in social media while avoiding mixed content warnings

If you’ve ever had the thrill of integrating external content into your website, you’ve likely run across the mixed content warning issue. In short, one can link to non-secure content from a secure page, but anything that would result in content being loaded from a non-secure source (a common example being an image URL) will likely cause a mixed content warning of some type in a user’s browser, when the page is served over HTTPS.

It’s generally fine to load HTTPS content in an HTTP page, of course. This means that when including content in a page, one can consider either replacing HTTP links with HTTPS ones, or using protocol/scheme-relative URLs. If a resource can be served over HTTPS, it’s a good practice to use HTTPS URLs at all times in website content. This avoids the problem of protocol-relative URLs when a resource either cannot be served over HTTPS (or, sometimes, HTTP), or where the URL is different depending on the protocol.

That last problem is rare, but unfortunately not non-existent. A prominent example occurs with Pinterest, which serves each pin’s images over both HTTP and HTTPS–but when using the latter, one must include an extra “s-“, for example:

https://s-some-really-long-url-stuff/and-more-stuff-etc.jpg

Unfortunately, when retrieving results using the Pinterest API, URLs for images (for both avatars and pins) are returned only in the non-secure flavor. Thus for Pinterest-API content included in a page presented over HTTPS, URLs should have the protocols switched to HTTPS, but also the extra “s-” must be added.
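
A minimal sketch of that rewrite follows. It assumes the extra “s-” belongs at the very start of the host name, as in the example above. The function name toSecurePinterestUrl and the host name used in the usage line are made up for illustration.

```javascript
// Rewrites a non-secure image URL to the HTTPS form described above.
// Assumption: the "s-" prefix goes directly in front of the host name.
function toSecurePinterestUrl(url) {
    var match = /^http:\/\/(.+)$/i.exec(String(url));
    // Leave anything that is not a plain http:// URL untouched
    // (already HTTPS, protocol-relative, empty, and so on)
    if (!match) return url;
    return 'https://s-' + match[1];
}

// Usage with a made-up host name:
var secureUrl = toSecurePinterestUrl('http://media.example.com/pins/123.jpg');
```

Only URLs known to come from the Pinterest API should be passed through a function like this, since other hosts will not follow the same convention.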

Luckily, most of the other big social media sites (Facebook, Twitter, YouTube) serve images at URLs returned by their various APIs just fine via HTTPS, with no funky differences between URL formats for HTTP and HTTPS.

Avoid a FOUCed content delivery experience using CSS and JavaScript

A Flash of Unstyled Content (FOUC) problem occurs when unstyled content is displayed in its raw form during page load, then later laid out as desired. This can be a problem in a Sitecore site as in any other; recently we encountered this while using the jQuery Exposure plugin.

To avoid this, try placing the following, or a functional equivalent, in the <head> section of your layout, after the script tag for jQuery itself:

<style>
.no-fouc {
    visibility : hidden ;
}
</style>

<script>
    $(function () { $( ".no-fouc" ).removeClass("no-fouc"); });
</script>

Then simply apply the no-fouc class to an appropriate container of the elements causing the problem.
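
If jQuery isn’t available where the snippet needs to run, a functional equivalent might look like the following sketch, which would go inside the script tag in place of the jQuery one-liner. The function name revealNoFouc is made up for illustration; it takes a document-like object so that it can be exercised outside a browser.

```javascript
// Removes the no-fouc class from every element carrying it,
// and returns the number of elements that were revealed.
function revealNoFouc(doc) {
    // querySelectorAll returns a static list, so removing the class
    // while looping does not disturb the iteration
    var hidden = doc.querySelectorAll('.no-fouc');
    for (var i = 0; i < hidden.length; i++) {
        hidden[i].classList.remove('no-fouc');
    }
    return hidden.length;
}

// In a browser, run once the DOM has been parsed.
if (typeof document !== 'undefined') {
    document.addEventListener('DOMContentLoaded', function () {
        revealNoFouc(document);
    });
}
```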