Best Practices for Windows Media Encoding

An article written by Ben Waggoner, February 21, 2007 at Streaming

Microsoft's Windows Media Video has been a leading web video format for many years, but good hands-on information for how to get the best results out of it hasn't always been easy to come by. Also, Windows Media has an enormous breadth of ways it's used, scaling from mobile phones to the PC to CE devices like the Xbox 360. This article strives to codify the most important best practices to get the optimum Windows Media encode, whatever the source and delivery environment.

Update to the Latest Codecs

The first and easiest best practice is to make sure that you have the current versions of the codecs installed on your encoding box. We've recently released a number of backwards-compatible enhancements to the codecs that offer broad improvements for anywhere Windows Media is used. The video codec has seen impressive, fully backwards-compatible improvements. Our codec is now 4-way threaded, meaning it can use all the cores in a dual-core, dual-socket workstation, or one of the new single-socket quad-core systems. Beyond that, there are a variety of general improvements in both performance and quality. More details can be found in "Using Windows Media Registry Keys," pp. 30–32 of the November 2006 issue of Streaming Media. Beyond video, there is also a new version of WMV 9 Advanced Profile, the new WMA 10 Pro, and enhancements to WMA 9, all detailed below.

This article refers to the new and updated versions of WMV 9 as "Version 11" or "v11." There are four ways to get the v11 codecs:

Video Codecs

First, a note on VC-1. There's been some confusion on the relationship between the Windows Media Video codecs and the SMPTE (Society of Motion Picture and Television Engineers) VC-1 spec. VC-1 is the SMPTE designation for their standardization of WMV 9. In essence, you can now think of Windows Media Video 9 as Microsoft's brand for our implementation of VC-1, as implemented for advanced streaming format (ASF) files.

The VC-1 Simple and Main Profiles are progressive, and hence part of the existing WMV 9 profile. The Advanced Profile requires the new WMV 9 AP implementation. Note that the older and rarely used WMV 9 Complex Profile was not included in the VC-1 spec, and should no longer be used (although files using it are still supported for playback).

Windows Media Video 9
Windows Media Video 9 (WMV 9) has been a popular, mainstream choice for web video applications since the WM 9 ("Corona") launch back in 2002. We've been able to provide several generations of backwards-compatible enhancements to both quality and performance, so today you can encode WMV 9 both faster and with higher quality than ever before.

In general, the only time you'd use a codec other than WMV 9 for web video use is if you were trying to stream native interlaced content (and hence using WMV 9 Advanced Profile, described below) or screen capture content (potentially using WMV 9.2 Screen, also below).

There are two profiles supported in WMV 9—Main Profile and Simple Profile. For computer playback, WMV 9 Main Profile is the optimum choice. Simple Profile is a simpler version of the codec that targets lower-powered mobile devices, like mobile phones. This simplicity means it requires less horsepower to play back, but it also requires more bits to provide equivalent quality to Main Profile since Main has so many more tools it can use to improve compression efficiency. That said, many mobile devices, including those running Windows Mobile 2003 or higher, will decode Main Profile, although they'll need a lower bit rate for it.

Many encoding tools don't provide a profile control, and only do Main Profile.

Windows Media Video 9 Advanced Profile
Windows Media Video 9 Advanced Profile is a non-backwards-compatible enhancement to WMV 9, focused primarily on much improved support for encoding content as interlaced, with other enhancements helpful for IPTV and HD DVD. For most web video, neither is needed.

WMV 9 AP was introduced with WMP 10 and the Format SDK 9.5 back in 2004. However, there have been a few format tweaks in the v11 implementation, meaning WMP 9 and 10 will require a codec download to play back v11-encoded WMV 9 AP.

Windows Media Video 9 Screen
The WMV 9 Screen codec is designed for efficient compression of screen recordings. It's a special-use codec—very efficient for this task, but not appropriate for encoding normal video. Screen is the most efficient with simpler, flat graphics, so you can get efficient encoding out of the "Windows Classic" theme in XP and Vista. However, richer graphics environments, especially Vista, include a lot of gradients and transparencies, and can often encode more efficiently using standard WMV 9.

For efficiency, WMV Screen needs access to the uncompressed RGB source video of the screen shots, and it doesn't work well when compressing from screen shots that have had any lossy encoding already applied to them. It's typically used in conjunction with lossless screen recording products like TechSmith's Camtasia. On a fast computer, it's possible to use Windows Media Encoder to record, or even broadcast, live screen activity with the Screen codec.

Compared to other screen capture codecs, WMV 9 Screen is unique in providing full support for 2-pass VBR and CBR encoding, making it possible to use it for real-time streaming.

Windows Media Video 9.1 Image
The WMV 9.1 Image codec is designed for video sequences made out of individual still images, including transitions. For this special class of content, Image can be quite a bit more efficient than WMV 9, but it is much less efficient for typical motion video sources.

Audio Codecs

Windows Media Audio 9.2
Windows Media Audio (WMA) 9.2 is a fully backwards-compatible upgrade to the venerable WMA standard. It's compatible back to WMP from the mid-'90s (predating ".WMA" files!), and can be played back virtually anywhere. The 9.2 version includes some minor (but welcome) performance and quality enhancements compared to the previous version.

WMA is the general, safe codec choice for any Windows Media file. It's flexible, offers high quality with sufficient bit rate, and is playable by anything that can play a .WMV file. However, there are some scenarios where Voice or WMA 10 Pro can offer better performance.

Windows Media Audio Voice 9
WMA 9 Voice is designed for low bit rate applications below 32Kbps, where WMA and WMA Pro don't perform as well. Despite its name, it does a credible job with music and other non-voice content. You're not going to dance to WMA Voice at low bit rates, but it can intelligibly compress music interludes and sound effects in otherwise voice-centric content.

WMA Voice 9 is supported in WMP 9 and higher. Note that Voice is not currently supported in all non-Windows players. Most notably, it isn't supported in the current versions of Flip4Mac or the Kinoma player for PalmOS.

WMA Voice replaced the now deprecated audio codec, and should be used instead.

Windows Media Audio 10 Pro
While the new video codec features of WMV are exciting, the biggest technical leap is the new Windows Media Audio 10 Professional codec (WMA Pro 10), which offers up to twice the compression efficiency of WMA 9. The original WMA Pro 9 has been around for several years, and it offers great audio quality and efficiency at 128Kbps and up. WMA Pro 9 supports up to 7.1 channels, up to 24-bit sampling, and sampling frequencies up to 96kHz. But the high minimum bit rate (128Kbps) took it out of the running for most web video tasks. Thus, most streaming video projects have kept using good old WMA 9, which provides good support for lower bit rates.

With WMP 11, we've added a new frequency interpolation mode to WMA Pro, and incremented the name to WMA 10 Pro. With the new mode, we encode a "baseband" version of the audio as normal WMA 9 Pro at half the selected frequency, with additional data that tells how the higher frequencies are added. This gives a stream that's backwards-compatible with the existing decoder, but provides enhanced quality with an updated decoder (bundled with WMP 11).

WMA Pro 10 provides up to two times the efficiency of WMA 9.2, so at 64Kbps it can provide similar quality to WMA at up to 128Kbps. However, if only the old decoder is used, you only get the lower-quality baseband audio.

We feel WMA 10 Pro is the best audio codec included in a major streaming platform today. It handily beats our older codecs, as well as MP3, AAC-LC, and even the new HE AAC-LC, as demonstrated in this study (PDF).

WMA 10 Pro is appropriate to use once the majority of your customers are using WMP 11 or another player that supports the full codec. Until then, keep in mind that at the same bit rate, a WMA 10 Pro stream played through the older, non-upgraded decoder can sound a little worse than a WMA 9.2 version. Because it's an enhancement of the older codec, WMA 10 Pro won't trigger a codec download; users only get the new codec if they install WMP 11 or Format SDK 11, or if they're running Windows Vista.

The biggest place where WMA 10 Pro is being used today is with Verizon's V CAST mobile media service. We're also working with over a dozen additional vendors to add WMA 10 Pro support to their devices.

Windows Media Audio 9.2 Lossless
The WMA 9.2 Lossless codec is, as the name implies, a lossless audio codec. A lossless audio codec's output is bit-for-bit identical to its input. Essentially, it's a more efficient alternative to PCM (uncompressed) encoding, and functionally equivalent to Zipping up a .WAV file.

The flip side of lossless encoding is that there's no bit rate control possible—each second of audio takes as many or as few bits as it needs. Hence the codec is only available in Quality VBR mode. Perfect silence takes up very little bandwidth, while white noise takes up as much as uncompressed. Typical savings are around 2:1 for music and 4:1 for TV/movie soundtracks.
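As a rough illustration of those ratios, the sketch below estimates the average data rate of a losslessly compressed track from its uncompressed PCM rate. The function names are made up for this example, and the 2:1 figure is just the typical music ratio quoted above, not something the codec guarantees for any particular file.

```python
def pcm_kbps(sample_rate_hz, bit_depth, channels):
    """Uncompressed PCM data rate in Kbps (base-10 K)."""
    return sample_rate_hz * bit_depth * channels / 1000


def estimated_lossless_kbps(pcm_rate_kbps, compression_ratio):
    """Rough average rate after lossless compression at an assumed ratio."""
    return pcm_rate_kbps / compression_ratio


# CD-style source: 44.1kHz, 16-bit, stereo
cd_audio = pcm_kbps(44_100, 16, 2)
print(cd_audio)                              # uncompressed rate in Kbps
print(estimated_lossless_kbps(cd_audio, 2))  # assuming the typical 2:1 for music
```

Even at a favorable ratio, the result is far above the bit rates normally used for web audio, which is why lossless is better suited to archiving than to delivery.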

In general, WMA 9.2 Lossless shouldn't be used for WMV files, and obviously not for streaming, since streams are CBR. WMA 9 and WMA 10 Pro can provide incredible-sounding audio, and even transparent compression, at lower bit rates than WMA Lossless.

Data Rate Modes

Constant Bit Rate (CBR)
In essence, a CBR file is one where the average bit rate (ABR) and peak bit rate (PBR) are identical. This is appropriate for when peak bit rate is the primary constraint, required with real-time streaming, and typically used with devices where the speed of the decoder is the limit.

However, the rate is only constant within a certain window of time, the buffer duration. So, a 200Kbps clip with a five-second buffer means that any five seconds of the file has to average at or below 200Kbps. But any individual second could go quite a bit higher than that.
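To make the buffer window concrete, here is a minimal sketch that checks a list of per-second bit counts against a CBR target. It takes the sliding-window description above literally; the function name and the sample numbers are invented for illustration, and the codec's real buffer model is more involved than this.

```python
def cbr_window_violations(bits_per_second, target_kbps, buffer_seconds):
    """Return start indices of buffer-length windows that exceed the target rate."""
    limit_bits = target_kbps * 1000 * buffer_seconds  # Kbps uses base-10 K
    violations = []
    window_bits = sum(bits_per_second[:buffer_seconds])
    for start in range(len(bits_per_second) - buffer_seconds + 1):
        if start > 0:
            # Slide the window: add the newest second, drop the oldest one
            window_bits += bits_per_second[start + buffer_seconds - 1]
            window_bits -= bits_per_second[start - 1]
        if window_bits > limit_bits:
            violations.append(start)
    return violations


# One second spikes to 400Kbps, yet no five-second window averages over 200Kbps
clip = [150_000, 150_000, 400_000, 150_000, 150_000, 150_000, 100_000]
print(cbr_window_violations(clip, target_kbps=200, buffer_seconds=5))
```

An empty list means the clip fits the window everywhere, even though one individual second runs at twice the target.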

1-pass CBR is required for live streaming (webcasting), of course. It's also the fastest bit rate-limited mode available in Windows Media (the bit rate-limited VBR modes are 2-pass only). Note that 1-pass CBR encoding can be significantly improved by the use of the Lookahead registry key parameter, described in a document linked at the end of this article.

2-Pass
2-pass encoding essentially lets the encoder see into the future, ramping the bit rate up and down as needed to account for future changes. With the v11 codecs, it always provides at least as good quality as 1-pass CBR, and will often provide significantly more consistent quality with variable content. And with the v11 codecs, the first pass is much faster than the second pass or a single pass would be, so going to 2-pass doesn't double encode time; a 20% increase is more typical. So 2-pass should generally be used instead of 1-pass for CBR, except for live streaming.

Variable Bit Rate (VBR)
The essential difference between CBR and VBR is that VBR files have a peak bit rate higher than the average bit rate. This lets a VBR file be more efficient for file size, since it can distribute bits throughout the file to provide optimal quality. You can also think of the difference as "CBR maintains bit rate by varying quality, and VBR maintains quality by varying bit rate."

1-Pass Quality-Limited VBR
1-pass quality-limited VBR is a pure VBR: you just specify the quality, and each frame takes as many bits as needed. The final file size can vary tremendously depending on complexity, and there's no limit on peak bit rate at all. Obviously, this makes quality-limited VBR a poor choice for content distribution. However, it's a great mode for archiving content, and since it is 1-pass, it can be captured in real time on a sufficiently powerful box.

2-Pass Bit Rate VBR (Constrained)
2-pass bit rate VBR (constrained) is the optimal encoding mode for when the playback environment can handle a peak bit rate higher than the average bit rate. This is typical of content that is distributed as files instead of streams, be it off a web server or a file on a disc. Almost every web video WMV file not targeted to run on Windows Media Services should be using Constrained VBR.

With previous versions of the codec, we recommended that 1-pass CBR be used for HD content, since there was a problem with occasional dropped frames with 2-pass at high bit rates. This issue has been addressed in the v11 codec, and 2-pass VBR can now be used for HD content. This can result in big savings in file size.

2-Pass Bit Rate VBR
The unconstrained 2-pass bit rate VBR mode is just the same as constrained VBR, but without any peak constraint. This means that playability isn't predictable, and in most cases the constrained mode should be used instead.
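The recommendations in this section boil down to a small lookup. The sketch below is only a summary of the advice above; the delivery labels ("live", "streaming", "download", "archive") are hypothetical names chosen for this example, not settings in any encoding tool.

```python
def suggest_rate_mode(delivery):
    """Map a delivery scenario to the data rate mode recommended in this article."""
    modes = {
        "live": "1-pass CBR",                             # webcasting has to be 1-pass
        "streaming": "2-pass CBR",                        # real-time streaming of stored files
        "download": "2-pass bit rate VBR (constrained)",  # files off a web server or disc
        "archive": "1-pass quality-limited VBR",          # size unpredictable, quality fixed
    }
    if delivery not in modes:
        raise ValueError(f"unknown delivery scenario: {delivery!r}")
    return modes[delivery]


for scenario in ("live", "streaming", "download", "archive"):
    print(scenario, "->", suggest_rate_mode(scenario))
```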

General Parameters

Data Rate
Probably the most critical parameter for any WMV file is its data rate. This determines how big the file is, and how much bandwidth a user would need to stream it. By default, Windows Media measures data rate in kilobits per second (Kbps). To figure out how big a file at a particular data rate is going to be, use this simple equation: the size in KB equals the data rate (in Kbps) divided by 8, then multiplied by the duration in seconds. The "8" is in there because data rate is in kilobits per second, but file size is in kilobytes. I use "b" for bits and "B" for bytes as shorthand.

Note that Windows Media, like most streaming platforms, uses the correct base-10 metric values for K, M, and G, not the base-two derived values often but erroneously used for shorthand in computing.

To correctly predict the file size, note that you're calculating accurate KB sizes, while most operating systems list file sizes in KiB/MiB/GiB by default. To get the real value, see the file's Properties and look at the value in bytes. For more information see this article.
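The file size arithmetic above can be sketched in a few lines. The function name and the sample values (500Kbps for ten minutes) are made up for illustration; the same result is printed in base-10 and base-two units to show why the number an operating system reports can look smaller than the one you calculated.

```python
def file_size_bytes(data_rate_kbps, duration_seconds):
    """Predicted file size in bytes for a given total data rate and duration."""
    # kilobits per second -> bits (base-10 K), then 8 bits per byte
    return data_rate_kbps * 1000 * duration_seconds / 8


size = file_size_bytes(500, 600)  # 500Kbps total for ten minutes

print(size)                # bytes, the value shown in the file's Properties
print(size / 1000)         # KB  (base-10)
print(size / 1000 ** 2)    # MB  (base-10)
print(size / 1024)         # KiB (base-two)
print(size / 1024 ** 2)    # MiB (base-two), noticeably smaller than the MB figure
```

Remember that the data rate here should be the total for the file, meaning video plus audio.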

Peak Buffer Size
The buffer-size parameter is available for all bit rate-limited modes. It specifies the duration (in seconds) of the window over which average (for CBR) or peak (for VBR) bit rate is calculated. Bigger buffers can provide more efficient encoding, but make hardware decode more complex and increase latency. For real-time streaming, the maximum startup and random access latency can go up by up to a second for each additional second of buffer.

Peak Bit Rate
Only used in bit rate VBR (peak), the peak bit rate value indicates the maximum bit rate during the peak. Higher values improve the quality of difficult sequences, but also increase the complexity of the decode. A simple way to determine appropriate bit rate is to encode in CBR with the same parameters, and find the highest data rate that will play back reliably. You can then use that as the peak bit rate.
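To see how average and peak bit rate relate, the sketch below measures both for a list of per-second bit counts, treating the peak as the busiest window of one buffer length. This is a simplified reading of the parameters described above; the function name and numbers are invented for the example, and it is not how the codec itself accounts for its buffer.

```python
def rate_stats(bits_per_second, buffer_seconds):
    """Return (average_kbps, peak_kbps); peak is the busiest buffer-length window."""
    if len(bits_per_second) < buffer_seconds:
        raise ValueError("clip is shorter than the buffer window")
    average_kbps = sum(bits_per_second) / len(bits_per_second) / 1000
    busiest_window_bits = max(
        sum(bits_per_second[start:start + buffer_seconds])
        for start in range(len(bits_per_second) - buffer_seconds + 1)
    )
    peak_kbps = busiest_window_bits / buffer_seconds / 1000
    return average_kbps, peak_kbps


# A clip with an easy start and end and a difficult middle
clip = [300_000, 300_000, 900_000, 900_000, 300_000, 300_000]
average, peak = rate_stats(clip, buffer_seconds=2)
print(average, peak)
print(peak <= 1000)  # would this fit a 1000Kbps peak bit rate setting?
```

In a CBR file the two numbers would be the same; in a constrained VBR file the peak is allowed to sit above the average, up to the limit you set.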

Keyframe Interval
Keyframe interval determines the maximum time between keyframes. A keyframe (also called an I-Frame, for independent or intra frame) is a fully self-contained frame that isn't based on any other frame. These are the only frames that can be immediately jumped to in random access.

More frequent keyframes make random access, like scrubbing through a file or startup for a streaming file, faster. However, they also reduce efficiency, since a typical keyframe needs more bits to provide equivalent quality compared to normal frames. Note that the codec will insert additional keyframes as needed throughout the file, at points where the video changes dramatically.
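The cost of sparse keyframes for random access can be illustrated with a short sketch. Since a frame between keyframes can't be decoded on its own, a seek has to begin at the keyframe at or before the requested time. The function name and keyframe times below are hypothetical, and real players may handle seeking differently.

```python
from bisect import bisect_right


def decode_start_for_seek(keyframe_times, seek_time):
    """Return the time of the keyframe at or before seek_time.

    keyframe_times must be sorted in ascending order, in seconds.
    """
    index = bisect_right(keyframe_times, seek_time) - 1
    return keyframe_times[max(index, 0)]


keyframes = [0.0, 8.0, 16.0, 24.0]  # assuming an eight-second keyframe interval
target = 13.5
start = decode_start_for_seek(keyframes, target)
print(start)           # decoding has to begin here
print(target - start)  # seconds of video to decode before the target frame
```

A shorter keyframe interval shrinks that gap, at the price of spending more bits on keyframes.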

Quality
The quality parameter used in the CBR and 1-pass (quality) VBR modes controls the target quality for the video. In quality VBR, it simply controls the quality of each frame, and hence how many bits it gets relative to its difficulty; higher values look better, but take a lot more bits.

In CBR, it controls emphasis on spatial versus temporal quality, essentially setting a minimum bar on frame quality. Lower values let the quality of video frames drop as low as needed in order to hit the target data rate without dropping frames. Higher values set a higher minimum frame quality, and the codec will drop frames on encoding in order to ensure the target bit rate and quality targets are maintained.

Using low quality settings can result in poorer video quality, even when the video is in no danger of dropping frames. This is because the codec doesn't always perfectly adapt to future changes in the video. In general, for most content you want to find the highest value that doesn't drop frames. The v11 codecs, especially if you're using the new registry key settings, can be a lot more efficient than previous iterations, so you can use a higher quality value with the same source and settings than before. I think 75 is a good starting value for most web video now. If you're using Windows Media Encoder, it'll report after the encode if there were any dropped frames, a handy way of seeing if you're using too high a setting. WME also provides reasonable defaults to start with.

Encoder Complexity
Encoder complexity controls how hard the encoder works to encode the video. There are six levels in the v11 codecs with about an eight-fold difference in encoding speed from slowest to fastest, and about a 20% improvement in compression efficiency from fastest to best (which actually isn't the slowest). Different tools present the option in different ways. In Windows Media Encoder, it is set in Tools > Options > Performance. For optimal quality, the second-to-slowest mode (listed as "4" or "80" in some tools, and second last to the left in WME) is almost always the best choice. The slowest mode is a lot slower and very rarely produces any measurable improvement in quality.

Intelligent Streaming
Intelligent Streaming is Microsoft's name for our multiple bit rate (MBR) encoding system. The idea behind MBR is to provide content in multiple bit rates in a single file in order to provide scalability in streaming. Then the player and Windows Media Services (WMS) cooperate to determine the highest data rate version of the video and audio that will fit within the available bandwidth.

Intelligent Streaming is only available when doing real-time streaming from WMS on Windows Server 2003. Files that won't be coming off WMS should not be encoded with Intelligent Streaming, since it'll just make the files larger without providing any added functionality. Also, not all clients support Intelligent Streaming. It works best when targeting WMP 9 or higher running on Windows.
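The basic idea of picking a stream from an MBR file can be sketched as below. This is only an illustration of "the highest data rate that fits"; the function name is made up, the fallback to the lowest rate when nothing fits is an assumption, and the actual negotiation between the player and WMS is not shown.

```python
def pick_stream(available_kbps, stream_kbps_options):
    """Pick the highest encoded bit rate that fits the available bandwidth."""
    fitting = [rate for rate in stream_kbps_options if rate <= available_kbps]
    if fitting:
        return max(fitting)
    # Assumption for this sketch: fall back to the smallest stream if none fits
    return min(stream_kbps_options)


encoded_rates = [100, 300, 700]  # total Kbps of each version in the file
print(pick_stream(450, encoded_rates))
print(pick_stream(56, encoded_rates))
```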

Tuning for the Target Player

Windows Media Player 11
Windows Media Player 11 is the newest version of WMP. It's available as a free download for Windows XP and built into Windows Vista. With WMP 11, you can safely use both WMV 9 Advanced Profile and WMA 10 Pro's LBR modes.

Windows Media Players 9 and 10
WMP 9 is a good baseline target for mass-market WMV delivery today. WMP 9 is preinstalled with the widely deployed Windows XP Service Pack 2, and it is available via Windows Update all the way back to Windows 98. WMP 9 established much of the modern infrastructure for Windows Media, and it's functionally equivalent to WMP 11 for almost all web video. It can safely use WMV 9 and all the WMA 9 audio codecs. WMP 10 was a fine release, but was focused more on performance and media management, and didn't add much functionality specific to web video.

Windows Media Players 6, 7, and XP
Older versions of WMP can use WMV 9 and WMA 9 using codec updates (either automatically downloaded, or via an enterprise deployment). However, none of the pre-WMP 9 versions support the full Intelligent Streaming model implemented in the current version of WMS, and hence can only switch between video tracks of the same frame size, and not at all between audio tracks. WMP 6.4 (the last version for Windows NT 4.0) also isn't reliable with VBR-encoded content.

Flip4Mac
Telestream's Flip4Mac is Microsoft's current recommended solution for Windows Media playback on the Mac.

Flip4Mac handles WMV 9 and WMA 9 and WMA 9 Pro very well, but doesn't currently support WMV 9 Advanced Profile or the other audio codecs. Telestream also offers professional versions of Flip4Mac that support importing WMV into Mac tools like Final Cut Pro and After Effects, and exporting to WMV in QuickTime Pro as well.

WMP 9.1 for Mac
While development has ended on it, WMP 9.1 for Mac is still available. While Flip4Mac provides higher performance and Intel compatibility, WMP 9.1 for Mac supports DRM and some web page embedding options that Flip4Mac does not currently support, although Telestream is working on improving embedding for future versions.

Windows Mobile
Windows Mobile has included Windows Media Player for a number of generations, and any device with Windows Mobile 2003 or later (including 5.0 and the just-released 6.0), has a very capable player, supporting WMV 9, WMA 9, Pro, and Voice.

Kinoma Player
Kinoma has recently introduced a new version of the Kinoma Player for PalmOS with support for Windows Media playback. Note that it currently supports only the WMV 9 and WMA 9 codecs, without support for the WMA 9 Voice codec sometimes used for low-bit rate applications.

VLC
The open-source VLC player is adding WMV support in its forthcoming 0.8.6 release. While this product is not supported by Microsoft, it may become a useful solution for WMV playback on other platforms.

WPF/E (Codename)
The Windows Presentation Foundation/Everywhere is Microsoft's codename for a forthcoming rich media browser plug-in. It will support WMV 9 and WMA 9 on Mac and Windows, and within Internet Explorer, Firefox, and Safari. It's currently available as a Community Technology Preview. Downloads and lots of other information are available here.

Advanced Registry Key Settings

There are a variety of new advanced registry key settings supported in the v11 WMV 9 and WMV 9 AP codecs that can be used to dramatically fine-tune video quality. These were covered in detail in this article. There is also a new PowerToy to edit and share those settings, available here.