
    Recently, someone attempted to download a deprecated version of the Windows Script debugger. This tool was used to debug scripts prior to the introduction of more powerful, modern tools like those that are built into IE8 and later. The user emailed me when they encountered a very surprising outcome:

    image

    After clicking the Run button, the download proceeds, but then the user sees the message: <filename> was reported as unsafe.

    image

    The obvious question is “Who or what reported this file as unsafe?”

    Many people might assume that this is a SmartScreen warning, but it’s not. When SmartScreen blocks a piece of known malware, it takes all the credit, as you can see in this screenshot:

    image

    So, if it’s not SmartScreen, where is the warning coming from?

    The answer is that the “<filename> was reported as unsafe” message occurs when the WinVerifyTrust API reports that there’s a problem with a downloaded file’s digital signature. Pretty straightforward. You might see this error if you tried to open an incompletely downloaded file, for instance.

    However… If you look at the signature in Windows Explorer, everything looks okay:

    image

    In particular, you’ll notice that the signature is marked OK at the top, and there’s a Timestamp at the bottom, which allows the signature to remain valid even after the certificate expires. However, that timestamp is twelve years old… this is a very old file.

    Now, if you click the View Certificate button, and you’re a regular reader of my blog, you might be able to spot the problem:

    image

    See it?

    The problem is that the certificate itself is signed using the MD2 hash algorithm. This algorithm has known vulnerabilities and is not safe to use for untrusted content delivered over insecure networks. To that end, as I’ve mentioned in passing a couple of times on this blog, Internet Explorer 8+ on Windows 7 SP1+ no longer accepts Authenticode signatures that use the MD2 or MD4 hashing algorithms in the certificate chain. (MD2 and MD4 on the root itself isn’t a worry, because the root itself is installed on the machine). This restriction is enforced by passing the WTD_DISABLE_MD2_MD4 flag to WinVerifyTrust, instructing it to reject signatures that use either of these weak algorithms on the certificate chain. For the time being, Windows Explorer itself permits use of MD2 and MD4 in the signatures of locally-installed files, since the primary threat in today’s environment comes from code delivered using the Web.
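
    If you’re curious what such a check looks like in code, here’s a minimal sketch (my own illustration of the API, not IE’s actual download-verification code) of calling WinVerifyTrust against a file with the weak-algorithm flag set. It assumes an SDK new enough to define WTD_DISABLE_MD2_MD4:

      #include <windows.h>
      #include <wintrust.h>
      #include <softpub.h>
      #pragma comment(lib, "wintrust.lib")

      // Sketch: verify a file's Authenticode signature, rejecting MD2/MD4 anywhere
      // in the certificate chain. Returns 0 (ERROR_SUCCESS) if the signature is OK.
      LONG VerifyDownloadedFile(LPCWSTR pwszFilePath)
      {
        WINTRUST_FILE_INFO fileInfo = { sizeof(fileInfo) };
        fileInfo.pcwszFilePath = pwszFilePath;

        WINTRUST_DATA wtd = { sizeof(wtd) };
        wtd.dwUIChoice          = WTD_UI_NONE;            // no prompts; just a status code
        wtd.fdwRevocationChecks = WTD_REVOKE_NONE;
        wtd.dwUnionChoice       = WTD_CHOICE_FILE;
        wtd.pFile               = &fileInfo;
        wtd.dwStateAction       = WTD_STATEACTION_VERIFY;
        wtd.dwProvFlags         = WTD_DISABLE_MD2_MD4;    // reject weak hashes in the chain

        GUID action = WINTRUST_ACTION_GENERIC_VERIFY_V2;
        LONG lStatus = WinVerifyTrust(NULL, &action, &wtd);

        // Release any state the verification call allocated.
        wtd.dwStateAction = WTD_STATEACTION_CLOSE;
        WinVerifyTrust(NULL, &action, &wtd);

        return lStatus;
      }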

    In the real world, downloads that are signed using deprecated algorithms prove to be quite uncommon, because relatively few software vendors were signing their packages in the late 1990s when MD2 and MD4 were in use. We’ve identified a few ancient installers on Microsoft’s Download Center that will need to be removed or re-signed using current certificates and newer algorithms (e.g. SHA-1) to present a stronger defense against skilled attackers who might try to spoof a digital signature.

    -Eric


    My First Law of Browser Quirks was introduced a while ago: If there’s a way for a site to take a dependency on a browser quirk, and break if that quirk is removed, it will happen. The Second Law of Browser Quirks is: If there’s a way for a site to combine a set of browser quirks to yield an entirely unexpected behavior, it will happen.

    I was reminded of the Second Law a few weeks ago, when a site developer reported a bizarre behavior: namely, if they visited a page that was sent with Cache-Control: max-age=0 a few times, they’d see that the first HTTP/200 response was correctly cached, and then conditional requests would properly get HTTP/304 Not Modified responses. The developer then deleted that page from the backend, and they’d see a HTTP/404 correctly come back for the next revalidation request. However, if the developer renavigated to the page a few times quickly, the previously-delivered HTTP/200 page would sometimes be “magically resuscitated” from the local cache.

    How, they wondered, could that possibly happen?

    It was particularly strange because hitting the Refresh button would correctly show the 404 page.

    Fortunately, I had just recently updated the Fiddler Caching Inspector, which examines a HTTP response to determine how it will be cached. A quick look at the Inspector led me to realize how this HTTP/404 page had yielded the unexpected result. Beyond specifying a Cache-Control: no-cache HTTP response header, the 404 page contained the following markup:

    <meta http-equiv="Expires" content="0" />

    Seeing caching directives in HTML markup always makes me a bit nervous, and it turns out that this is in fact the root cause of the problem. For historical reasons, Trident will respond to two directives in HTML markup:

    <meta http-equiv="Pragma" content="no-cache" />
    <meta http-equiv="Expires" content="HTTP-DATE" />

    No other directives (e.g. Cache-Control) are supported.

    Each of the two supported directives, if encountered, causes Trident to call down to WinINET to adjust the HTTP caching freshness of the cache entry stored under the current URL. For the Pragma directive, if the current URL is HTTPS, Trident will call DeleteUrlCacheEntry; if the URL isn’t HTTPS, it will instead call SetUrlCacheEntryInfo to adjust the time at which the document expires. For the Expires directive, the provided date will simply be used as the new expiration time.
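
    As a rough illustration (a hypothetical sketch of the kind of WinINET call involved, not Trident’s actual code), adjusting a cache entry’s expiration looks something like this:

      #include <windows.h>
      #include <wininet.h>
      #pragma comment(lib, "wininet.lib")

      // Hypothetical sketch: push a cached URL's expiration time to ftNewExpiry.
      // CACHE_ENTRY_EXPTIME_FC tells WinINET to update only the ExpireTime field.
      BOOL SetCacheEntryExpiration(LPCWSTR pwszUrl, FILETIME ftNewExpiry)
      {
        INTERNET_CACHE_ENTRY_INFOW info = { sizeof(info) };
        info.ExpireTime = ftNewExpiry;
        return SetUrlCacheEntryInfoW(pwszUrl, &info, CACHE_ENTRY_EXPTIME_FC);
      }

      // For the HTTPS Pragma case described above, the entry is removed instead:
      //   DeleteUrlCacheEntryW(pwszUrl);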

    Now, Fiddler’s Caching Inspector points out one more thing: WinINET will never cache an HTTP/404 response—it will only modify the cache if it receives a HTTP/200, HTTP/206, or HTTP/3xx redirect status code. Sending a caching directive in the HTTP/404 markup was unnecessary, because IE won’t ever cache that error page. That’s not to say that the META tag does nothing, however-- Trident doesn’t know (or care) that the document was sent with an uncacheable status code, and dutifully updates the expiration info for the existing cache entry—the HTTP/200 document that had originally been stored in an expired state.

    Half of the puzzle is solved, but why does Expires=0 result in the cache entry being deemed fresh? It’s especially surprising because if you send such an Expires directive using a HTTP header, the resulting cache entry will not be fresh.

    If WinINET downloads a response with an invalid Expires header (e.g. one that doesn’t contain a valid HTTPDATE value) and no other caching directives, it will mark the document as having expired one hour ago. Trident, however, has no such logic. If you specify an invalid time, Trident grabs the current timestamp and uses that as the expiration. Trident will also use the current timestamp if it encounters the Pragma: no-cache directive. If the user tries to re-navigate to the current document during the same second in which the HTTP/404 was processed, the incorrectly-updated expiration of the existing cache entry will result in it being treated as fresh for that request. If the user hit the Refresh button or F5, the cache would be bypassed and the 404 page would be shown.

    The solution for this website was pretty simple—either get rid of the unneeded META entirely (my recommendation) or update it to use a valid, long-ago HTTPDATE to avoid the incorrect update of the freshness info to the current timestamp. For instance, this markup:

    <meta http-equiv="Expires" content="Sat, 11 Jun 2011 01:01:01 GMT" />

    …results in the cached entry being marked as Expired.

    Various bugs have been filed from this investigation, but my advice remains that web developers should do their very best to avoid specifying caching directives in markup.

    -Eric Lawrence


    Beware Silly Similes

    Recently, there was a blog post which described a browser security feature as "like a seat-belt that snaps when you crash."

    This wasn’t a particularly noteworthy event because similes are pretty common in our field. Almost everyone likes similes because they enable the simplification of highly technical topics into easily-conceptualized terms that anyone can understand. Similes are like JPEGs – they distill a big, complicated, intricate picture into a smaller (but still evocative) picture that bears some amount of resemblance to the original truth.

    In our headline-driven culture, where subtlety and nuance don't draw the clicks that a quick witticism can bring, a great simile will spread across the web like pollen on a spring morning.

    Of course, the downside1 is that, like JPEGs, the compression is lossy. Important information can be obliterated in the conversion process. Technical experts might observe the loss, but an everyday consumer may not detect the difference. In some cases, that’s fine, but in others it really isn’t.

    The problem with the simile in that recent blog post is that the information loss is unnecessarily high. The topic is complex, but that doesn't mean we can't use a simile—we just need to think a little harder.

    Two more accurate similes immediately leap to mind. We could describe the security feature as

    "Like a bicycle helmet that won’t prevent drowning if your boat sinks."

    Or we could say it’s

    "Like a flak jacket that can’t protect the lungs from chemical weapons."

    Both of these are accurate statements that more closely reflect reality. They convey to the everyday person the nuance that the feature doesn't protect against all attacks—specifically, not those that it's not designed to protect against.

    It's absolutely fair to wonder "Hey, why not develop a bike helmet that also acts as a life preserver?" Or, "Why not replace flak-jackets with MechWarrior-style exoskeletons that protect against a full spectrum of attacks?" These are legitimate questions, to be sure, but neither mistakenly suggests that bike helmets or flak jackets don't have their place in the world as it exists today.

    The next time you hear a great simile, think carefully about what information its originator might have left out, and what motivations they may have had when crafting it.

    -Eric

    1 Of course, if you're the one trying to convince someone to buy or adopt something, the loss of information that contradicts your point of view is considered an upside, not a downside.


    Continuing on from last year’s IE9 Minor Changes list, this post describes minor changes you can find in Internet Explorer 10 in the Windows 8 Consumer Preview.

    There are many changes that I will not be covering, so please do not mistake this for a comprehensive list; also note that I'm deliberately skipping over the big feature improvements that will be discussed on the IEBlog. Improvements in IE10 that impact issues or features previously discussed on this blog can be found by searching for the tag BetterInIE10.

    Without further ado, here are some IE10 Minor Changes:

    • Fragments are preserved when redirecting between sites, matching Firefox, Chrome, and updated standards.
    • Internet Explorer now ignores no-cache on back/forward navigations, as other browsers do and RFC2616 allows.
      • This also allows Conditional GET revalidation of no-cache resources.
      • Use no-store to prevent resource reuse in forward/back navigations.
    • XMLHttpRequest no longer fails with a “1223” error when a HTTP/204 is returned by a server.
    • XMLHttpRequest respects cache-busting flags when you use CTRL+F5 to reload a page.
    • XMLHttpRequest’s getResponseHeader method now returns a combined string containing all values in the event that multiple headers of the same name exist
    • XMLHttpRequest now supports CORS and COMET-streaming and more...
      • Note: In the Consumer Preview, there is a bug that you cannot make HTTP->HTTPS or FTP->HTTPS requests. This bug will be fixed in a future update to IE10.
    • IE10 now supports LINK REL=dns-prefetch. In IE9, we only would perform DNS prefetch for LINK REL=prefetch as the dns-prefetch token was not yet defined.
    • HTTPS Mixed Content warnings now automatically dismiss after a few seconds.
    • In IE10 Standards mode, the host and pathname DOM properties no longer return unexpected results.
    • Internet Explorer no longer lowers the Connections-Per-Host limit to 2 when a VPN or Modem connection is active. The limit stays at 6.
    • The CTRL+U keyboard combination opens the View Source window, matching other browsers.
    • CTRL+SHIFT+B toggles display of the Favorites bar
    • PNG gAMA chunks are supported on Windows 8.
    • The application/JSON MIME type is now registered, preventing IE from treating it as an unknown type (which leads to MIME-sniffing).
    • Download Manager now reports download speed properly even when the server fails to return a Content-Length header.
      • In IE9, the download speed would be reported as slower and slower in such a case, even if the download proceeded at a constant rate.
    • SCRIPT DEFER behavior is fixed and now spec-compliant. (async is also supported)
    • Stylesheet limits dramatically increased.
    • For interop, the BASE element now supports relative paths.
      • Still, I beg of you: don’t use a relative URL in your BASE element, and avoid the BASE element entirely if you possibly can.
    • Prior to IE10, if a form's name/id conflicted with the name of a window's built-in property, the script reference would resolve to the form element. That no longer occurs.
    • On Windows 7 and below, if you attempted to ShellExecute a URL that used a custom protocol scheme, ShellExecute would first check to see if the protocol scheme was a registered Application Protocol. If not, it would check to see if the scheme was an IE-registered Asynchronous Pluggable Protocol. If so, IE would be invoked and passed the URL. In Windows 8, the fallback to check for an asynchronous pluggable protocol no longer occurs, so the browser will not launch unless the target protocol is a registered application protocol. To that end, the RES:// protocol is now a registered application protocol.
    • The option Use UTF8 for Mailto Links was removed from INETCPL. IE will now always pass Unicode to the ShellExecute API.
    • The availability of Application Protocols can now be detected from JavaScript using the msProtocols collection.
    • The window.navigator.appMinorVersion string is set to "BETA" to allow detecting this pre-release version of IE10.
    • The Do not save encrypted pages to disk option in the Internet Control Panel's Advanced Tab now behaves differently. Instead of trying to prevent HTTPS resources from being saved to disk, the option will delete cached-from-HTTPS resources from the cache when the browser is closed.
    • GIF Animation frame-rate limit increased; values as low as 20ms are supported; lower values are pushed to 100ms.
    • IETldList.xml was removed; the Top Level Domain list is now statically compiled into a function in iertutil.dll.
    • In IE9, the ActiveX Filtering feature would block all ActiveX controls, even MSXML's controls like the ActiveX version of the XMLHTTPRequest object. In IE10, the ActiveX Filtering feature instead blocks all ActiveX controls that are not considered a part of the Web Platform.
      • The Web Platform controls list includes the MSXML controls, the Scripting.Dictionary object, and the HTC, XML, SVG, and XHTML objects.
      • These Web Platform controls are the only controls allowed in Metro IE.
      • This ActiveX Filtering change means that it is much less of a hassle to run with ActiveX Filtering enabled, because even legacy sites using bad patterns (like using the ActiveX XHR object in preference to the native version) still work correctly.

    You should read about the preview over on the IEBlog and in the developers guide, and try out the TestDrive Demos. You should also check out the IE10 Compatibility Cookbook to learn more about features being obsoleted or changed in IE10. Lastly, see the IE10 Web Platform Features post for a deeper look at web platform APIs.

     

    -Eric


    Mind Your Parameters

    A recent blog post reminded me that I should blog about a bad pattern we saw a few months back while trying to fix some application compatibility bugs with IE10. It turns out that a lot of applications that want to invoke a webpage call ShellExecute without reading the documentation for the parameters of that function.

       HINSTANCE ShellExecute(
      __in_opt  HWND hwnd,
      __in_opt  LPCTSTR lpOperation,
      __in      LPCTSTR lpFile,
      __in_opt  LPCTSTR lpParameters,
      __in_opt  LPCTSTR lpDirectory,
      __in      INT nShowCmd
    );

    A hastily-written program will do something like

       ShellExecute(0, NULL, TEXT("http://example.com/whatever"), NULL, NULL, NULL);

    A key problem is that last parameter: nShowCmd is one of the show commands, and NULL/0 maps to SW_HIDE.

    Now, in practice this often didn't matter in the past because IE was usually invoked via DDE, and the SW_HIDE would be ignored. In the IE10 Developer Preview, IE switched over to using COM for invocation, and the SW_HIDE parameter would result in a hidden IE instance opening, just as the caller had requested... but not what the developer probably expected.
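
    The fix is simply to pass an explicit show command. A minimal sketch of a corrected call:

      #include <windows.h>
      #include <shellapi.h>
      #pragma comment(lib, "shell32.lib")

      int main()
      {
        // Pass a real show command (SW_SHOWNORMAL) so the browser window is visible.
        ShellExecuteW(NULL, L"open", L"http://example.com/whatever",
                      NULL, NULL, SW_SHOWNORMAL);
        return 0;
      }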

    -Eric


    Last week, Andy Zeigler announced the introduction of Enhanced Protected Mode (EPM) over on the IEBlog. In today’s post, I’d like to provide further technical details about EPM to help security researchers, IT professionals, enthusiasts, and developers better understand how this feature works and what impact it may have on scenarios they care about.

    Internet Explorer’s Process Model and Bitness

    For the past several releases, Internet Explorer has sported a multi-process architecture, where the “Frame” or “Manager” process runs at Medium Integrity and the “Tab” or “Content” processes run at either Low Integrity (Protected Mode) or Medium Integrity (for Zones where Protected Mode is disabled, like Intranet sites). All HTML content and ActiveX controls run in the Content Process. Even toolbars, which visually appear as if they’re in the Manager Process, really run down in a Content Process.

    For IE10, we’ve changed IE such that Manager Processes always run as 64bit processes when running on a 64bit version of Windows. This improves security, among other things. We do not expect that this change will meaningfully impact compatibility, because the Manager Process is designed not to run 3rd party content, and thus there’s little opportunity for anyone to take a dependency upon the Frame Process’ bitness. In support of this change, the various registry points that point to Internet Explorer have been updated to point to C:\Program Files\Internet Explorer\iexplore.exe. If you manually invoke C:\Program Files (x86)\Internet Explorer\iexplore.exe, that 32bit process will simply launch the 64bit version of iexplore.exe (with the appropriate command line parameters) before exiting.

    For the Content Processes, the story is a little more complicated. In the Metro-style experience of Internet Explorer, all Content Processes will run at 64bit (on Win64), which means that they benefit from the improved security provided in 64bit. The compatibility impact is minimal because Metro-style IE does not load any browser add-ons (Toolbars, BHOs, and non-browser-platform COM objects like MIME Handlers, URLMon Protocol Handlers, MIME Filters, ActiveX controls, etc). Back in IE9, running in 64bit mode meant that JavaScript was not JIT-compiled, but for IE10, the JIT compiler was enhanced to work for both 32bit and 64bit tabs, providing great performance in both. Additionally, many major browser add-ons like Flash, Silverlight, and Java are now available in 64bit versions.

    In Internet Explorer on the Desktop, Content Processes remain at 32bit by default for compatibility with 32bit ActiveX controls, Toolbars, BHOs, etc. Even when you directly launch the 64bit iexplore.exe executable, you will still have a 64bit Manager Process that hosts only 32bit Content Processes. If you want to enable 64bit Content Processes for the Desktop, you must tick the Enable Enhanced Protected Mode option in the Security section of Internet Explorer’s Tools > Internet Options > Advanced tab. When this option is enabled, all Content Processes that are running in Protected Mode (e.g. Internet Zone and Restricted Zone, by default) will begin to use 64bit Content Processes.

    Note: In the Windows 8 Release Preview, if you enable Protected Mode for the Local Intranet and Trusted Zones, even if you enable EPM, the Intranet and Trusted Zones will run in 32bit LowIL rather than a 64bit AppContainer.

    enableepm

    In the upcoming Internet Explorer 10 on Windows 7 and Windows Server 2008R2, the only thing that enabling Enhanced Protected Mode does is turn on 64bit Content Processes. But, when running on Windows 8, the EPM option provides even more security by also causing the sandboxed Content Process to run in a new process isolation feature called “AppContainer.”

    Intro to AppContainer

    Windows Vista introduced the concept of Integrity Levels. The default integrity levels used by applications (Low / Medium / High) constrained what parts of the system could be written (e.g. registry keys, files, etc) and how applications could communicate or share data. Notably, in most circumstances, Integrity Levels were “Allow Read-Up; Block Write-Up,” meaning that even a Low Integrity process like an IE tab would have full read-access to the rest of the disk and registry, even those locations which were marked as Medium or High integrity.

    Windows 8 introduces a new process isolation mechanism, called AppContainer, that offers more fine-grained security permissions and which blocks Write and Read Access to most of the system. There’s not a lot of documentation specifically about AppContainer because all Metro-style applications run in AppContainers, so most of the documentation is written from that point of view. For instance, here’s a page that describes the capabilities that a Metro-style application can declare that it needs: http://msdn.microsoft.com/en-us/library/windows/apps/hh464936.aspx. Under the covers, it’s the AppContainer that helps ensure that an App does not have access to capabilities that it hasn’t declared and been granted by the user.

    IE Tabs and AppContainer

    Tabs running in Enhanced Protected Mode on Windows 8 run inside an AppContainer. On Windows 7 and Windows Server 2008 R2, AppContainer does not exist, so EPM only enables 64bit tabs on a 64bit OS. (That also means that enabling EPM on a 32bit Windows 7 system doesn’t do anything, because a 32bit Windows 7 system supports neither 64bit nor AppContainer).

    On Windows 8, Metro-style IE’s tabs in the Internet and Restricted Zone run in Enhanced Protected mode, while tabs in other zones run in 64bit only. You cannot disable EPM for Metro-style IE except by turning off Protected Mode entirely.

    By default, Desktop IE’s tabs run in the Low Integrity Protected Mode at 32bit. Only if you enable Enhanced Protected Mode using the Internet Options control panel will Desktop IE’s tabs run in AppContainer (and 64bit, if available).

    IE’s AppContainer

    Internet Explorer’s EPM-mode tabs run in an AppContainer named windows_ie_ac_001. In the Windows 8 Consumer Preview release, this container declares the capabilities internetClient, location, and sharedUserCertificates.

    Notably, the container does not specify internetClientServer, privateNetworkClientServer, enterpriseAuthentication, or any of the *Library capabilities, which means that Internet content runs in a tightly-limited process.

    AppContainer - Network Restrictions

    AppContainer introduces three key restrictions related to Network Connectivity that impact EPM. I’ll describe each.

    Acting as a Network Server is Blocked

    Because EPM’s AppContainer does not have the internetClientServer capability, there’s no way for an EPM process to accept inbound connection attempts from the network. Typically, such connections weren’t possible in the Web Platform anyway (e.g. there's no JavaScript method to listen() on a new TCP/IP socket), but some browser add-ons were able to accept inbound connections (even though this became pretty uncommon with the widespread deployment of firewalls). When EPM is enabled, such add-ons will not be able to accept remote connections.

    Loopback-blocked

    Apps running in AppContainer are not allowed to make connections to locally-running processes outside of their own package. This means, for instance, if you run a local developer instance of Apache or IIS on your own computer, you will find that Metro-style applications are unable to connect to that server. This also means that by-default, you cannot use Fiddler to debug Metro-style applications, because Fiddler acts as a proxy server on your local computer. To unblock Fiddler users, I’ve published a simple utility that allows users to remove the Loopback Restriction on the AppContainers of their choice; you can also use this utility to allow your App or MetroIE to contact a locally-running web server for development purposes.

    image

    Please note that Windows Store-delivered applications will not be permitted to set a loopback exemption for themselves, so this is only useful for test/development purposes.
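
    If you prefer a command line, Windows 8 also ships the CheckNetIsolation.exe tool, which can manage the same exemptions; assuming the tool is present on your build, usage from an elevated prompt looks roughly like this (again, for development and testing purposes only):

      CheckNetIsolation.exe LoopbackExempt -a -n=windows_ie_ac_001    (add the exemption)
      CheckNetIsolation.exe LoopbackExempt -s                         (show current exemptions)
      CheckNetIsolation.exe LoopbackExempt -d -n=windows_ie_ac_001    (remove the exemption)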

    Now, one key thing to understand about Loopback connections in Metro-style Internet Explorer is that the Hostname you use in your URL matters a lot! If you try to navigate to http://127.0.0.1/, your page will be treated as being in the Internet Zone and thus will run in an EPM tab, and the loading of the page will be blocked by the AppContainer’s Loopback-block-- you’ll see a Page Could Not Be Displayed error page.

    However, if you instead try the URL http://localhost/ (assuming your Intranet Zone is enabled), you will find that Internet Explorer considers your content to be Local Intranet Zone, and thus it is loaded in a Medium Integrity (non-Protected Mode) tab. The page will successfully load since it is not running in EPM, and thus isn't blocked by the network restrictions provided by AppContainer.

    Private Network resources

    Because EPM does not declare the privateNetworkClientServer capability, your Intranet resources are protected from many types of cross-zone attacks (usually called “Cross-Site-Request-Forgery (CSRF)” and “Intranet Port Scanning.”) Internet pages are not able to frame Intranet pages, load images or resources from them, send them CORS XHR requests, etc.

    However, it’s important to understand how this restriction functions, because it can have some very surprising outcomes depending on how your Internet Explorer Security Zones are configured.

    For instance, many of us have a home router with a configuration UI accessible at http://192.168.1.1 or a similar address that is not globally-routable. On one hand, it’s desirable to prevent Internet content from sending requests to such addresses to help block CSRF-attacks that might maliciously reconfigure poorly-secured routers. However, for historical and other reasons, Security Zones consider this dotted hostname to be an Internet-Zone address by default, which means that if you attempt to navigate to the Router configuration page in Metro-style IE, you may encounter a Page Cannot Be Displayed error page. If you enable EPM in the Desktop mode of the browser, you can use the F12 Developer tools to see why the request was blocked:

    EPMOnFails

    Note: The next update to IE10 will use a more specific error message here; this string was designed for developers of Metro-style applications, not for folks debugging in EPM in IE.

    To resolve this issue, you can either use a non-dotted hostname for your router (e.g. my DNS points http://router to 192.168.1.1) or you can manually add the router’s address to your Trusted Sites zone using the Tools > Internet Options > Security | Trusted | Sites... list. When navigating to Trusted Sites, the navigation occurs outside of Protected Mode, so AppContainer restrictions are not a problem.

    There’s a non-obvious subtlety here which bears mentioning. When I personally tried to reproduce this restriction at home, I had no problem in navigating straight to the router’s IP Address in both Metro and Desktop IE with EPM enabled:

    EPMOff

    What’s up with that?

    The explanation is that the AppContainer network restrictions are sensitive to your network configuration. When I had originally connected to my router, I had selected the following configuration:

    MarkPublic

    As a result, the Windows Firewall considered my router part of a public network:

    LinkSysIsPub

    …and thus AppContainers are freely able to contact the 192.168.1.1 address: because I had indicated that I was on a “Public Network,” the privateNetworkClientServer capability is not required to contact local, non-routable addresses like 192.168.1.1.

    I can enable the network restriction by reconfiguring my network settings. First, I use the sidebar's context menu to tell Windows to “forget” my Linksys connection. Then, I re-establish it as a “home” network:

    MarkPrivate

    This causes the Windows Firewall to consider this a “Private network”:

    LinksysPriv

    ...and subsequently block connections to "local" addresses from AppContainers that lack the privateNetworkClientServer capability.

    AppContainer – Isolation of Cookies and Cache

    AppContainers do not have read or write access to files outside of their container, which means that the cache, cookies, and other web-platform state information is not shared between different AppContainers and the rest of the system. This means, for instance, that if you have a Windows Web App (a Metro-style app written in HTML+JavaScript), that application will not share cookies or cache files with Internet Explorer. Similarly, Metro-style apps will not share cookies and cache with one another.

    This “partitioning” can be great for security and privacy, because it means that your use of one application isn’t visible to another. For instance, if you log into your Banking App, the banking app’s cache, cookies, and credentials aren’t available to be stolen from pages you browse in Metro-style Internet Explorer, even if a vulnerability was discovered that allowed an attacker to run arbitrary native code in the AppContainer.

    However, partitioning can lead to unexpected behaviors. I describe some of these in a previous post called Beware Cookie Sharing in Cross-Zone Scenarios. In that post, I observed that even in IE7 to IE9, there exists a partition between sites that run in Medium Integrity vs. those that run in Protected Mode, such that cookies are not shared between those modes. That can lead to problems when a site in one zone frames another, since the sandbox in which all frames in a page run is determined by the top-level page’s Zone.

    In Windows 8, the existing Medium IL / Low IL partition remains, and a new EPM AppContainer partition is added as well. It’s now possible for a user to have three independent copies of a cookie for a single site in IE (not even counting other non-IE Metro Apps). For instance, if www.example.com tries to set a cookie when it’s the subframe of an Intranet top-level page, that cookie will go in the MediumIL cookie jar. If the user then visits www.example.com in Metro-style IE, the cookie will be set in the EPM’s AppContainer cookie jar. Then, if the user visits www.example.com in Desktop IE, the cookie will be set in the LowIL cookie jar. These three cookies are independent, and changes or deletions of the cookie in one partition will not be seen in the other partitions. If the user "logs out" in one mode of the browser (which deletes the cookie) the other modes of the browser will remain "logged in" (since their cookies are isolated). Sites that need to securely log a user out across all browser modes should continue to expire the session on the server, rather than only relying on the client to stop sending a given cookie.

    To be explicit, these data stores (cookies, cache, and other web-platform state) are partitioned between Internet Zone content running in Metro-style IE (in EPM) and Desktop IE (in LowIL).

    In contrast, Local Intranet Zone and Trusted Zone pages run in Medium IL in both Metro-style IE and Desktop IE, and thus these Zones' data stores are shared between both browser modes.

    Cookie Pushing

    One exception exists to the partitioning behavior described above. When you use the View on the Desktop command in Metro-style IE, it will "push" the current tab’s session cookies into the new Desktop IE instance that opens. However, this only applies to session cookies and not persistent cookies.

    You can see how this works by following these steps:

    1. Clear all cookies using Delete Browser History
    2. Visit www.facebook.com in Metro-style IE
    3. Log in with the Keep me logged in box unchecked on the Facebook site
    4. Facebook will send you a session cookie containing your credentials.
    5. Invoke the View on the Desktop command

    At this point, you should find that Desktop IE shows your default post-logon Facebook page (e.g. your Wall)-- you're still logged in.

    Now close your browsers and repeat these steps, except at step #3, check the Keep me logged in option. At Step #4, Facebook will send you a persistent cookie with your credentials. When you switch to Desktop IE at step #5, you will find that you are not logged in to Facebook, because the persistent cookie set by Facebook isn’t pushed to Desktop IE.

    You will further notice that if you enable Enhanced Protected Mode for Desktop IE, when switching from Metro IE to Desktop IE you will remain logged into Facebook in Desktop, because MetroIE in EPM shares cookies with DesktopIE in EPM since they are both running in the same AppContainer.

    Add-ons in Enhanced Protected Mode

    Metro-style Internet Explorer does not load add-ons, so there are no AppContainer considerations to worry about in MetroIE.

    In contrast, most users expect add-ons to work in Desktop IE, but very few add-ons are AppContainer-compatible today. If you enable EPM in the desktop and have a BHO or Toolbar that isn’t EPM compatible, the add-on will be disabled:

    BingBar

    If you visit a page that requires an ActiveX control which is not EPM-compatible, you’ll be provided the opportunity to load the page in a special “Low IL Compat” tab that runs the page at 32bit in LowIL instead of in a 64bit AppContainer:

    Notification message which reads “This webpage wants to run 'Adobe Flash Player 10.3 d162'. If you trust this site, you can disable Enhanced Protected Mode for this site to run the control.” The notification bar contains one button labeled “Disable”.

    In order to be EPM-compatible, Toolbars and BHOs must be available in 32bit and 64bit flavors, to avoid toolbars or other UI appearing and disappearing as you navigate between zones that run at different bitnesses. To load in EPM on Windows 8, the add-on must also indicate that it is compatible with the AppContainer isolation feature by registering with a COM Component Category that indicates that the component was designed and tested to ensure it runs correctly in the no-read-up process.

    The category is named CATID_AppContainerCompatible and its GUID is {59fb2056-d625-48d0-a944-1a85b5ab2640}. C++ programmers may use:

      DEFINE_GUID(CATID_AppContainerCompatible, 0x59fb2056,0xd625,0x48d0,0xa9,0x44,0x1a,0x85,0xb5,0xab,0x26,0x40);
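
    For reference, here’s one way to perform that registration programmatically (a sketch using the standard component categories manager; your add-on’s own CLSID is passed in, and COM must already be initialized on the calling thread):

      #include <windows.h>
      #include <initguid.h>
      #include <comcat.h>
      #pragma comment(lib, "ole32.lib")

      DEFINE_GUID(CATID_AppContainerCompatible, 0x59fb2056,0xd625,0x48d0,0xa9,0x44,0x1a,0x85,0xb5,0xab,0x26,0x40);

      // Sketch: mark the given add-on CLSID as AppContainer-compatible by registering
      // it under the CATID_AppContainerCompatible component category.
      HRESULT MarkAddonAppContainerCompatible(REFCLSID clsidAddon)
      {
        ICatRegister* pcr = NULL;
        HRESULT hr = CoCreateInstance(CLSID_StdComponentCategoriesMgr, NULL,
                                      CLSCTX_INPROC_SERVER, IID_ICatRegister,
                                      (void**)&pcr);
        if (SUCCEEDED(hr))
        {
          CATID catid = CATID_AppContainerCompatible;
          hr = pcr->RegisterClassImplCategories(clsidAddon, 1, &catid);
          pcr->Release();
        }
        return hr;
      }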

    Any non-trivial add-on is likely to find that it needs access to resources that are not available from within an AppContainer. The way to securely provide such access is to build a broker object that runs at Medium IL. In Vista and later, brokers were needed to write protected resources, and in EPM, they are required to read protected resources. The general pattern is:

    1. Untrusted code (the add-on running in the Protected Mode tab) calls a method in the broker, passing zero or more arguments.
    2. The broker evaluates the request's arguments and its own security policy.
    3. The broker confirms with the user that the requested operation is acceptable (e.g. by showing a Save prompt or whatever).
    4. The broker undertakes the operation if allowed, or blocks it if denied.

    Writing a broker is a significant undertaking, and requires a thorough security audit to ensure that the broker doesn’t allow malicious code to escape from the tab running in Protected Mode.

    -Eric

    PS: Please see this post for discussion of the impact of EPM on loading of local files that contain a Mark-of-the-Web.


    Recently, the IESG approved publication of a new Internet-Draft defining the HTTP/308 status code (Intended Status: Experimental). This status code is defined as the "Permanent" variant of the existing HTTP/307 status code. Recall that HTTP/307 was defined back in 1999 to remove the ambiguity around the HTTP/301 and HTTP/302 redirection codes, for which many user-agents would change the redirected request's method from POST to GET. HTTP/303 was defined to unambiguously indicate that the UA should change the method to GET, while HTTP/307 was defined to unambiguously indicate that the UA should preserve the current method.

    HTTP/308 is pretty much the same as HTTP/307, with two defined deltas:

    1. The status is defined as a "Permanent Redirect".
    2. A client with "link editing" capabilities "ought" to automatically re-link references to the target URL.

    These basic properties are inherited from HTTP/301, although there's some subtlety here you might miss. First, despite being defined as permanent, caches explicitly MAY use heuristics to expire the redirection (making "permanence" a bit squishy). That's probably not a bad thing, considering that we’ve found that many real-world sites aren't using HTTP/301 properly, and if a cache truly treats the redirect as permanent, such sites end up in infinite redirect loops. Secondly, I've never seen any client with "link editing" capabilities that automatically updates URIs based on 301. That's most likely due to the security implications of doing so, along with the fact that 301 is often misused as a temporary redirect. We're not terribly likely to see automatic "link editing" based on HTTP/308 responses either.

    Despite its practical equivalence to HTTP/307, HTTP/308 has one unique property which makes it very interesting:

    As an entirely new status code, no client implemented in the first 21 years of HTTP’s existence supports HTTP/308.

    This characteristic is intriguing, because it means that only bleeding-edge browsers (e.g. the latest Firefox nightly) work with the HTTP/308 status. If any of the existing billions of existing web clients encounters a HTTP/308, the client will not perform a same-method redirect as called for in the Internet-Draft.

    A server could use User-Agent sniffing to send back a 307 instead of a 308 to a client if it's not on the bleeding edge of standards support; such sniffing is necessary because there's nothing in HTTP request headers that indicates what response codes the client supports. Notably, the draft suggests that the server avoid the use of sniffing to send the 308 only to clients that support it, noting:

    Server implementers are advised not to vary the status code based on characteristics of the request, such as the User-Agent header field ("User-Agent Sniffing") -- doing so usually results in both hard to maintain and hard to debug code and would also require special attention to caching

    Of course, there are some situations where the server knows a priori that the client supports HTTP/308... for instance, in closed environments where the user is forced to use a particular client, HTTP/308 can reliably deliver its function.

    Most clients, however, will simply render the body of the HTTP/308 response as a webpage, which is the "fallback" behavior for unknown 3xx response codes. Unfortunately, if the 308 was being used to redirect a POST, the fallback script or markup in the body will generally not be able to perform the POST to the new target URL, because the POST body is not available to the page... unless the body of the 308 is delivered with a form echoing back the originally-submitted POST body.
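
    To make that last idea concrete, such a fallback body (hypothetical target URL and field names) might look something like this:

      <!-- Fallback for clients that don't follow HTTP/308: echo the submitted
           fields so the user can re-POST them to the new location. -->
      <form method="POST" action="http://new.example.com/submit">
        <input type="hidden" name="field1" value="originally submitted value">
        <input type="submit" value="Continue to the new location">
      </form>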

    So instead, the HTTP/308 response body serves as a great way to advertise to the user that their browser isn't at the bleeding-edge of Internet Standards. Such a page could point the user to the nightly build of a browser which does support HTTP/308. For users on legacy devices that are not readily updatable to such browsers (e.g. phones, consoles, tablets after the end of their supported lifetime) links to web stores can be provided to offer the user the ability to purchase devices that support the latest standards experiments.

    Fiddler users can simulate HTTP/308 behavior even with non-hip clients by clicking Rules > Customize Rules. Inside the OnPeekAtResponseHeaders method, add the following block:

      if (oSession.responseCode == 308) {
          oSession.responseCode = 307;
          oSession["ui-backcolor"] = "teal";
      }

    You can test your browser’s support for HTTP/308 on this test page: http://webdbg.com/test/308/.

    -Eric


    In Part 1 of this series, I described how Same Origin Policy prevents web content delivered from one origin from reading content from another origin. (If you haven’t read that post yet, please do start there.)

    In today’s post, we’ll look at what restrictions are placed on writing between origins.

    What is a “Write”?

    For the purposes of this post, to “write” means to send content from one origin to another. Writes could take any of the following forms:

    • Navigating to a URL (especially with a query string parameter)
    • Uploading a file or performing a HTTP POST using a web form, XMLHTTPRequest, or XDomainRequest 
    • Manipulating a property of a frame
    • Writing content to a frame’s document or manipulating a DOM object in that document
    • Sending a message to another frame using postMessage

    Some forms of cross-origin write are permitted in the Same Origin Policy, and others are not.

    Why is Cross-Origin Writing Ever Permitted?

    Given the significant restrictions imposed by Same Origin Policy on cross-origin reads, it may be surprising that SOP allows cross-origin writes at all. However, consider what the web would look like without cross-origin writes—every website would act as an isolated sandbox, with no way to send data to other sites and services. Most “mashups” would be impossible, because each web site would run within a “silo,” able to communicate only with its own origin.

    While prevention of cross-origin writes would mitigate entire classes of web security vulnerabilities (CSRF, XSS, etc), the web is a far richer platform because some forms of Cross-Origin Writes are permitted in most cases. Of course, the possibility of receiving a malicious request from any attacker presents a significant defensive burden on a server and the documents it serves—every write must be scrutinized closely to prevent malicious input from corrupting the recipient.

    Attacks

    Generally speaking, any web page may navigate to any other, passing whatever URL path, query string, and fragment it wants to. This flexibility is key to the openness of the web, but it’s also the source of many security bugs. Careless site developers frequently fail to validate the input received in a URL or a POST body, and either store that information or echo it back to the client application. This can result in a cross-site scripting (XSS) vulnerability, because that malicious input could contain script or other active content that would run in the security context of the server that replayed or echoed the input.

    Another common and dangerous method of attack based on cross-origin writes is called a Cross Site Request Forgery (CSRF, pronounced “Sea Surf”) attack. CSRF attacks rely on the fact that web browsers will present cookies and authentication information to the servers they are communicating with without regard to the context of that communication. The presence of the victim user’s cookies and credentials (sometimes called “ambient authority”) typically means that, as far as the server is concerned, any request from the user’s browser is treated as legitimate. An attacker exploits the user’s ambient authority by crafting a malicious page that causes the user’s browser to issue (usually invisible) attacker-crafted requests to a cross-origin victim server (e.g. a bank, social network, or store). The recipient server often cannot distinguish a request which was intended by the user (e.g. “buy this book”, “subscribe to that person’s postings”) from a request that the user’s browser sent under the direction of markup on a malicious site.

    For instance, in this recent exploit, the user’s browser is tricked into making requests to Amazon’s servers using hidden IFRAMEs or IMG tags. Amazon, upon receiving the user’s cookie, associates the requests with the victim user, and subsequently associates the requested pages and images with that user's shopping history. When the user later visits Amazon, the store’s recommendations are used to suggest items that Amazon has related to the CSRF-generated history.

    Cross-origin requests can be used to launch password-guessing attacks (e.g. by tricking the user’s browser into sending thousands of requests using a dictionary of common passwords), to determine the user’s browsing history (by profiling cache timings), and to egress information stolen from pages where limited (non-script) injections were possible (e.g. using CSS selectors to invoke url() requests).

    Simpler exploits are possible as well—a site might blindly trust the value of a URL parameter (e.g. IsAdmin=True) and expose itself to abuse.

    Restrictions

    Perhaps the first ever same-origin restriction is that a DOM is not allowed to access most objects in another DOM unless the two DOMs were served from the same fully-qualified domain. One relaxation to that rule (allowing two FQDNs that share a common private domain via mutual opt-in of setting the document.domain property) has been a source of myriad problems over the years.

    While the web platform exposed a ton of cross-origin write capability, as browser developers and standards authors realized the dangers such capabilities posed, they “froze” most such capabilities and required explicit opt-in for relaxing the rules beyond the capabilities already irrevocably baked into the platform.

    A great example of such restrictions is the development of “Cross Origin Resource Sharing” (CORS) mechanisms like the XDomainRequest and updated XMLHTTPRequest objects. These objects impose significant restrictions to try to limit the risk of their cross-domain functionality. For instance, the XDomainRequest object was designed to never emit credentials and allows only a limited set of methods and headers; the goal was to match the capabilities of HTML4 FORMs. By limiting XDR to what could be emitted from a HTML Form, we could ensure that we were not opening up new CSRF vectors with the object.

    Similarly, when the CORS XMLHTTPRequest object adds anything but a very limited set of headers to a request, that request becomes a non-simple request, which requires the server to respond affirmatively to a “preflight” request to confirm that it expects such headers. Likewise, if methods other than GET, POST, and HEAD are used, a preflight confirmation is required. In this way, the potential for cross-origin abuse can be limited. Unfortunately, the risk isn’t entirely erased; some web applications can still be abused even with the CORS restrictions in place.
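
    As a rough illustration (with hypothetical hostnames and header values), a non-simple request triggers an exchange like this before the actual request is sent:

      OPTIONS /api/update HTTP/1.1
      Host: api.example.com
      Origin: http://www.example.org
      Access-Control-Request-Method: PUT
      Access-Control-Request-Headers: X-Custom-Header

      HTTP/1.1 200 OK
      Access-Control-Allow-Origin: http://www.example.org
      Access-Control-Allow-Methods: PUT
      Access-Control-Allow-Headers: X-Custom-Header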

    Generally speaking, cross-origin requests should not be permitted to submit methods other than GET, POST, and HEAD, to send custom headers, or to issue POSTs with Content-Types other than application/x-www-form-urlencoded, multipart/form-data, or text/plain. Where possible, requests should not be permitted from the Internet to the Intranet or local computer, and the automatic transmission of cookies or credentials should be avoided if at all possible.

    Defenses

    Depending on the browser, certain types of cross-origin navigations may not be permitted. For instance, Internet Explorer has a feature called Zone Elevation Protection that prevents navigation between sites of certain security zones (e.g. Internet-Zone pages cannot load Local Machine Zone resources). Internet Explorer also recently blocked navigation to file:// URIs from Internet-Zone pages, and IE10 on Windows 8 blocks loading of Private Network resources from the Internet Zone as well.

    Internet Explorer’s P3P mechanism restricts which cookies may be replayed from a 3rd-party context; this is designed to help protect the user’s privacy, but a clever web developer could use this mechanism to help protect their site against cross-origin requests bearing the user’s cookies. (For the last few years, I’ve been pondering whether it makes sense to offer a FirstParty attribute on cookies that prohibits the cookie from ever being sent in a 3rd party context; this would likely be simpler for other browsers to implement than a full P3P engine, and it would ensure that even users with ridiculously lax P3P preferences still benefit from the protection.)

    The HTML5 postMessage feature was explicitly designed with cross-document cross-origin communications in mind. The targetOrigin parameter allows the caller to specify that only an expected origin may receive its message. Similarly, the recipient may check the event’s origin property to ensure that the sender is a trusted source.

    Websites can protect themselves against cross-origin writes by carefully validating input data. For instance, when collecting information that will be stored or echoed, that information should be scrubbed for XSS and SQL injection attacks. Some sites check the request's Referer header, but should never use that as their only defense (since buggy plugins or features might allow an attacker to submit a forged origin header, and some security software strips all Referers). Many sites implement CSRF Tokens, whereby they reject requests that do not bear a one-time token supplied by the server; the token is kept secret by the Deny Read aspect of Same Origin Policy that we covered back in Part 1 of this series.

    I'll conclude this series with Part 3: Allow Execute ... hopefully, it will take me less than 32 months to write that one. :-)

    -Eric


    When the Internet Explorer team first introduced the Search Box next to the address bar in IE7, we also introduced an easy way for users to install search engines offered by websites that they visit. Users who want to add a site's search engine to the browser's search box can do so with just two clicks.

    Building a Search Provider XML file is pretty easy for web developers, but for fun, I put together a little Internet Explorer Web Search Provider Builder tool that generates the XML file based on a simple web form. A few months after I built this tool, a new PM on the IE team was tasked with taking my sloppy ASP code and making it into a formally supported feature, posted on the official IE Search Guide page.
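
    For reference, the generated file is just a small OpenSearch description document; a minimal sketch (with a hypothetical site and search URL) looks something like this:

      <OpenSearchDescription xmlns="http://a9.com/-/spec/opensearch/1.1/">
        <ShortName>Example Search</ShortName>
        <Description>Searches www.example.com</Description>
        <Url type="text/html" template="http://www.example.com/search?q={searchTerms}"/>
      </OpenSearchDescription>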

    Unfortunately, the official Search Guide's builder feature was removed in a site redesign late last year (it wasn't very commonly used). Folks who want an easy way to build a search provider may simply use the original tool, found here: http://www.enhanceie.com/ie/searchbuilder.asp.

     

    -Eric

    <personal anecdote> The new PM referenced above became my wife just over three years later. :-) </personal anecdote>


    Use IMG tags only for Images

    First, a bit of background.

    When web developers are optimizing the performance of their sites, often they try to use their homepage to pre-cache resources that will be used on later pages. They might do so by kicking off "pre-fetch" resource downloads after the content required by the homepage itself has downloaded. It turns out that some sites attempt to use IMG tags for pre-fetching purposes-- IMGs seem like an ideal choice because they are not limited to same-origin requests, and if you prefetch a stylesheet or JavaScript file, the target rules or script will not execute when loaded into the IMG tag.

    Some especially cutting-edge sites have tried to “help” browsers’ Lookahead Downloaders by using hidden IMG tags at the very top of their response to reference resources that will be needed later in the same page. The idea is that, by doing so, the browser will get resource requests out on the wire earlier, such that when the browser’s parser reaches the SCRIPT or LINK tag that needs the resource, the request for that resource will already be well-underway. Of course, the whole point of the Lookahead is to get resource requests out earlier, but the technique of using IMG tags at the very top ensures that those URLs are very early in the HTTP response, possibly long before the client receives the markup containing the SCRIPT or LINK tags.

    Unfortunately, using IMG tags to prefetch JavaScript and Stylesheets can actually slow your page down.

    I love a good mystery, so I was excited when a website owner emailed me a Fiddler capture that showed Internet Explorer was downloading a resource on a page twice, despite the fact that the second download kicked off after the first one ended, and the first download had a proper caching header that would allow the resource to be re-used. What was going on here?

    Fortunately, the capture was made with the X-Download-Initiator header enabled, so I was able to see why each request was made. The first download of the script was kicked off when the Lookahead reached an IMG tag. The second was kicked off when the parser reached a SCRIPT tag with the same SRC value. The URLs were identical-- why was the second request sent?

    A further look showed that the client had actually aborted the first request, which explains why the second request was needed. But why was the first request aborted?

    When IE encounters an IMG tag, it creates an image object and assigns the download request to it. As data arrives from the image download, it’s fed into the browser's image decoders. The decoders will reject data as malformed if you feed them plaintext, which seems reasonable, since they can't possibly make use of such data. When the decoders reject the data as "Not possibly an image," the image object will abort its processing. As a part of that abort, if the download has not yet completed, it too is aborted.

    Aborting a download can be very bad for performance. Firstly, the client won’t get the resource that it asked for—only the part of the resource downloaded before the abort will be cached, and even that portion is cached only if the response was served with an ETAG and a Content-Length. If those headers are present, the browser may be able to later download the remainder of the file using a HTTP Range request. Secondly, establishing TCP/IP connections (and possibly HTTPS handshaking on top of that) can be quite expensive, so throwing away perfectly good connections can measurably increase the load time of your page.

    I built a MeddlerScript that demonstrates this problem quite clearly. After loading the .ms file, click the Compile and View button in Meddler. Your browser will open a HTML page that pulls in two JavaScript files, one using an IMG tag and one using the SCRIPT tag. The script pulled in using the SCRIPT tag immediately executes as expected, and the script downloaded by the IMG tag, as expected, does not run. If you then click the link to the second HTML page, you’ll see that the SCRIPT-fetched script is pulled from the cache and reused, but the IMG-fetched script is re-downloaded from the server. If you look at the Log tab in Meddler, you’ll see the following information:

    Script loaded at 7:15:26.  Ready for connections.
    0: GET /;
    1: GET /796-SCRIPTPretendingToBeAImg.js;
    2: GET /796-SCRIPTNOTPretendingToBeAImg.js;
    1: Error: An established connection was aborted by the software in your host machine
    3: GET /UseTheScript.htm;
    4: GET /796-SCRIPTPretendingToBeAImg.js; Range: bytes=4237-; If-Range: "796";

    The error line for session #1 shows where the Script-fetched-by-IMG download was aborted, closing the connection. Later, session #4 shows that, when navigating to the second page of the repro, the browser sends a HTTP request asking for a partial download of the remainder of the script file.

    If you use Fiddler to monitor this scenario, you’ll see that it doesn’t repro unless you enable Streaming Mode. That’s because, by default, Fiddler fully buffers each response and delivers it to the client in one shot, which means that the image object won’t have the chance to abort until the download has already been completed and cached. This reiterates the fact that using IMG for pre-fetch can succeed—but only under the best of network conditions.

    The same abort behavior exists in IE6 to IE10, and Firefox 12.0. Opera 11.61, Chrome 18, and Safari 5.1.5 do not appear to abort when invalid image content is downloaded.

    Interestingly, Firefox does not appear to cache even the partial file; when it re-downloads the script, the request does not contain a Range header. That behavior might be explained if Firefox uses a separate cache for IMG requests vs. other tags (an architecture that I believe Mozilla used at some point). Further evidence pointing in this direction exists. If you update the MeddlerScript so that the first request completes so quickly that the client has no chance to abort, Firefox still re-downloads the script file on the second page of the repro.

    Script loaded at 7:31:39.  Ready for connections.
    0: GET /;
    1: GET /143-SCRIPTPretendingToBeAImg.js;
    2: GET /143-SCRIPTNOTPretendingToBeAImg.js;
    3: GET /usethescript.htm;
    4: GET /143-SCRIPTPretendingToBeAImg.js;

    When the Web Developer who encountered this problem asked me for alternatives, my first thought was to try using IE’s startDownload method, which I hoped would accommodate their scenario, even though the method is limited to Same-Origin requests. Unfortunately, it turns out that startDownload isn’t a suitable replacement, because downloads initiated by this method are conducted with a no-cache flag, such that the cache is bypassed when making the request, and the response isn’t committed to the cache.

    HTML5 proposes an explicit prefetch Link relation to allow clients to recognize resources for which pre-fetching may be beneficial. Internet Explorer 9 and 10 use these LINKs to perform DNS-prefetching; resources are not downloaded.
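
    For example, a page might hint at a resource needed later like this (a sketch; the URL is hypothetical):

    <link rel="prefetch" href="http://example.com/scripts/next-page.js">

    Per the above, IE9 and IE10 will use such a hint only to pre-resolve the hostname; the file itself is not downloaded.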

    -Eric

    PS: Using IMG tags to pre-fetch images is, of course, entirely fine… so long as you’re not trying to pre-fetch images from HTTP on a page delivered by HTTPS. Doing that will cause a Mixed-Content problem and your page’s lock icon will disappear.


  • 05/16/12--10:07: Please Stop Polluting (chan 5189999)
  • When I surf the web, I almost always have Fiddler running, and as a consequence I see a lot of “hidden” pollution in pages. Much of this cruft has built up over the years, copied from site to site, probably with little critical thought about its necessity.

    Please remove any META tags you have that specify MSSmartTagsPreventParsing. This directive only ever had any effect in an IE6 Beta, and the feature in question never shipped in a final version of the browser. Sending this directive is a waste of bandwidth.

    Next, META tags that specify imagetoolbar should probably be removed. This META disables the IE6-era Gallery Toolbar that showed atop images of a certain size; that toolbar was removed in IE7. Similarly, the galleryimg attribute on IMG tags is generally not needed for the same reason.

    In nearly all cases, sites that send the pre-check and post-check directives in the Cache-Control HTTP Response header should remove these directives. These directives only ever worked in Desktop IE browsers, and they almost always are sent as part of a long string of no-cache directives (PHP unnecessarily sent these directives when a page was not to be cached). Sending these directives when you’re using other directives to forbid caching is a waste of bandwidth.
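
    For reference, the cruft in question usually looks something like the following (a representative sketch; your exact markup and headers will differ):

    <meta name="MSSmartTagsPreventParsing" content="true">  <!-- no effect outside an IE6 beta -->
    <meta http-equiv="imagetoolbar" content="no">            <!-- toolbar was removed in IE7 -->
    <img src="logo.png" galleryimg="no">                     <!-- attribute no longer needed -->

    Cache-Control: no-store, no-cache, must-revalidate, post-check=0, pre-check=0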

    -Eric


  • 05/30/12--13:53: Brain Dump: Random Tidbits (chan 5189999)
  • This post contains random IE-related tidbits for which there’s either not enough material or time to write a full post. I expect to revisit and expand this list from time to time.

    Case-Sensitivity in Cross-Frame Scripting of File URIs

    Same-Origin-Policy controls how script running in web pages may interact with other pages. Normally, in IE, an origin consists of a protocol scheme, hostname, and zone number. Hostnames are canonicalized to lowercase. However, file URIs’ origins consist of the protocol scheme, first component of the path, and zone number. That’s because file shares are named, and different shares may have different access lists. For instance file://server/Accounting and file://server/Dev/ are considered different origins. Making matters more interesting, on some file systems, paths and share names are case-sensitive. As a consequence, the Origin of a file URI is also case-sensitive. You will find that file://server/Dev/Page1.htm can interact with file://server/Dev/Page2.htm but not file://server/dev/Page3.htm because the Origin for the first two pages is FILE:server/Dev while the origin for the third is FILE:server/dev.

    Blocking ActiveX controls in the Web Browser Control

    Hosts of the Web Browser can control how it behaves by implementing IOleClientSite and responding to DISPID_AMBIENT_DLCONTROL with a set of desired behavior flags. One important caveat: the flag DLCTL_NO_RUNACTIVEXCTLS only blocks OBJECT tags within the document itself. It has no impact on the use of calls to new ActiveXObject(“…”) from script, if DLCTL_NO_SCRIPTS was not set. In order to prevent ActiveX execution in a Web Browser host, supply an IInternetSecurityManager and return URLPOLICY_DISALLOW and S_FALSE when your ProcessURLAction implementation is called with URLACTION_ACTIVEX_RUN. To permit only a specified “allow list” of controls to run, the ProcessURLAction implementation can examine the CLSID of the requested control; that CLSID is passed in using the pContext parameter.

    FavIcons in the Windows 8 "Metro" Start Screen

    If you pin a site to the Windows 8 Start Screen, it will only show the site's FavIcon if the site supplied a 32x32 image in its .ico file. Learn more here.

    ActiveX Filtering and Zones

    In IE10, the ActiveX Filtering feature was changed to permit a small list of controls that are deemed part of the web platform. This change was undertaken both to improve the user-experience and to provide developers with a means of emulating Metro-style Internet Explorer's ActiveX restrictions in the desktop, where the F12 Developer Tools are available for debugging purposes.

    Unfortunately, this debugging strategy won't always work effectively because the ActiveX Filtering feature is disabled by default in the Local Intranet Zone. For instance, if you're loading your development site from http://localhost, it will run in the Local Intranet Zone and thus ActiveX Filtering is not applied. In order to enable ActiveX Filtering for the Intranet zone, you will need to adjust the setting Tools > Internet Options > Security > Local intranet > Custom Level > Allow ActiveX Filtering to Enable.

     

    To be continued…


  • 06/05/12--10:48: The Intranet Zone (chan 5189999)
  • Internet Explorer maps web content into one of five security zones. After the Local Machine Zone, the Local Intranet Zone is probably the most misunderstood of the Zones, and is a common source of confusion and compatibility glitches.

    Mapping into the Local Intranet Zone

    For the Trusted and Restricted Sites zones, Zone Mapping is simple. URLMon checks the URL’s origin against the URL patterns in the user’s Zone mapping table, displayed inside Tools > Internet Options > Security > [Zone] > Sites. The Local Machine Zone is a bit more complicated, but I’ve written about that before. Sites that aren’t mapped to the Local Machine, Trusted, Restricted or Local Intranet Zones will be defaulted to the Internet Zone.

    So, that leaves only the Local Intranet zone. How does the browser decide whether a resource should be mapped to the Local Intranet zone rather than defaulting to the Internet Zone? This mystery led to one of my very first investigations when I joined the Internet Explorer team.

    Some might guess that the browser simply resolves the hostname using DNS and then maps “private” IP addresses (defined in RFC1918) to the Intranet zone. While that’s a fine guess, it’s actually incorrect, and the MapUrlToZoneEx2 function doesn’t take the IP address of the target site into account at all. There are a few reasons for this, including the fact that some large organizations have public IP addresses for hosts on their “Intranet” and that URLMon needs to be able to determine the Zone of a site even if that site isn’t presently reachable or listed in DNS.

    Instead, determination that a site belongs to the Local Intranet Zone is based on a number of rules:

    1. Direct Mapping. As with other Zones, users or network admins may map a list of URL patterns into the Local Intranet Zone. This list is viewable by clicking Tools > Internet Options > Security >  Local Intranet > Sites > Advanced.
    2. The PlainHostName rule (aka "The Dot rule"). If the URI’s hostname doesn’t contain any periods (e.g. http://team/) then it is mapped to the Local Intranet Zone.
    3. The fixed Proxy Bypass list. If the user has a fixed proxy specified inside Tools > Internet Options > Connections > LAN Settings, then sites listed to bypass that proxy will be mapped to the Local Intranet zone. The fixed proxy bypass list can be found by clicking the Advanced button; it’s at the bottom of the screen in the box labeled Exceptions.
    4. (WPAD) Proxy Script. If the user’s proxy configuration is “Automatically detect settings” or “Use automatic configuration script” inside Tools > Internet Options > Connections > LAN Settings, the browser will run the FindProxyForURL function in the specified WPAD proxy configuration script to determine which proxy should be used for each request. If the script returns "DIRECT", the browser will bypass the proxy and the site will be mapped into the Local Intranet Zone. When mapping a URL to a Zone, URLMon will call the FindProxyForURL function to determine if the bypass rule applies. One interesting twist is that the proxy script may itself call dnsResolve to get a site’s IP address and use that information as a part of its determination; a minimal sketch of such a script appears below.
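
    For example, a bare-bones proxy configuration script might look like this sketch (the proxy address and network range are hypothetical):

    function FindProxyForURL(url, host) {
        // Plain hostnames bypass the proxy; IE maps bypassed sites to the Local Intranet Zone.
        if (isPlainHostName(host))
            return "DIRECT";
        // The script may also resolve the host and bypass the proxy for private addresses.
        if (isInNet(dnsResolve(host), "10.0.0.0", "255.0.0.0"))
            return "DIRECT";
        return "PROXY proxy.example.com:8080";
    }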

    Rules #2, #3, and #4 can be controlled using the checkboxes found at Tools > Internet Options > Security >  Local Intranet > Sites:

    image

    The top box controls rule #2, while the second checkbox controls rules #3 and #4.

    Disabling Local Intranet Zone

    Since Internet Explorer 7, the Local Intranet Zone may be disabled in certain environments. The Local Intranet Zone is enabled by default if the current machine is domain-joined. It is also enabled if the Network Location Awareness API indicates that at least one of the connections is managed (NLA_NETWORK_MANAGED).

    In other cases, the user will see a notification upon visiting a site that would have been mapped to the Local Intranet Zone had that Zone been enabled:

    image

    Or in the Metro-style browser:

    image

    If you click the “Turn on” button, you’ll see the following confirmation:

    image

    If you choose Yes, the settings inside Tools > Internet Options > Security > Local intranet > Sites will adjust to:

    image

    The top box will be unchecked and the other Intranet-related options will be checked. The REG_DWORD named WarnOnIntranet under HKCU\Software\Microsoft\Windows\CurrentVersion\Internet Settings\ will be set to 0, which means that even if you later recheck the “Automatically detect intranet network” box, you’ll never see the “Intranet settings are off by default” notification bar.

    AppContainers, the Firewall and the Local Intranet Zone

    Windows 8 introduced a new process isolation technology called AppContainer, which is used by the new Enhanced Protected Mode feature enabled by default in Metro-style Internet Explorer. AppContainers utilize the Windows Firewall to enforce restrictions on what network addresses may be contacted by code running within the AppContainer. If an AppContainer has the InternetClient capability, then outbound network connections may be made to the Internet. If an AppContainer has the PrivateNetworkClientServer capability, then network connections may be made to "private network" addresses. What is considered a "private network" is configurable by Group Policy / Network Administrators, but for many users this will include the RFC1918 addresses.

    The AppContainer used by Enhanced Protected Mode has only the InternetClient capability and explicitly does not have the PrivateNetworkClientServer capability. This has some interesting implications.

    Say you're a home user on a non-domain-joined PC, and you've got a wireless router running at 192.168.1.1, with the hostname "DDWRT". When you configured your wireless connection in Windows, you chose the option to allow connection to devices, which makes the 192.168.* address range a part of your "private network." Now, in Metro-style IE, try to open http://192.168.1.1. Observe that you see a "Page could not be displayed" error, because the target hostname was mapped to the Internet Zone (by the "dot rule"), and thus loaded in an Enhanced Protected Mode tab. That EPM tab runs in an AppContainer without permission to contact the private network, and thus the connection fails. Now, try loading http://ddwrt/. You will see the "Page could not be displayed" error, but also the "Intranet settings are disabled by default" error message. Now, click the "Turn on Intranet Settings" button. Observe that the page reloads successfully, because http://ddwrt/ is mapped to the newly-enabled Local Intranet Zone (by the "dot rule") and the Local Intranet Zone runs at Medium IL, outside of EPM/AppContainer.

    What’s Different in the Local Intranet Zone

    When markup runs in the Local Intranet Zone, it may behave differently than markup running in the Internet Zone. These differences include:

    • Pages will run in Compatibility View unless they explicitly specify another document mode. This option can be disabled using the Tools > Compatibility View Settings menu.
    • The default security template for the Local Intranet Zone is Medium-low. This means that many URLActions have more liberal settings than content running in the Internet Zone. In particular, the popup blocker is set to allow popups, and features like ActiveX Filtering, the XSS Filter, and SmartScreen are disabled by default. Credentials may be automatically submitted to Intranet sites using the NTLM and Negotiate protocols.
    • P3P and cookie controls do not apply to sites in the Local Intranet Zone.
    • The Local Intranet Zone runs outside of Protected Mode at Medium Integrity Level. Cache, cookies, IndexedDBs, localStorage, etc are all partitioned and not shared between processes running at MediumIL and those running in Protected Mode or Enhanced Protected Mode. This can cause a number of problems with Single-Sign-on (SSO) and Federated Authentication architectures, because cookies set at one integrity level are not shared with tabs running at another integrity level. 

    Hosting sites “at the root”

    A few years ago, ICANN voted to allow organizations to create new generic TLDs. For instance, an organization could buy the TLD “insurance” so that users could go to, say, http://contoso.insurance or https://woodgrove.insurance.

    The problem is that some purchasers of such TLDs will expect to be able to host pages “at the root,” so that, for instance, a user could visit http://insurance/ to sign up for a policy.

    See the problem?

    Because it lacks any embedded dots, this URL would be mapped to the Local Intranet zone, giving it different default behaviors and additional permissions. Additionally, the site isn’t reachable for many users—the user will end up with a “Page could not be displayed” error when they attempt to visit the site, because the user’s local DNS server will attempt to qualify the hostname and look for a non-existent record. Currently, nine country-specific TLDs attempt to host pages at the root; none of these sites can be loaded from most corporate networks.

    -Eric


    Over on the Microsoft PKI blog, there’s some important information about upcoming changes for website operators who use HTTPS or deploy Authenticode-signed applications or ActiveX controls.

    Weak RSA Keys Blocked

    To briefly summarize the PKI team’s post, a security update coming to Windows 2008, Win7, Windows Vista, Windows 2003, and Windows XP in August 2012 will treat signatures that use RSA keys weaker than 1024 bits as invalid. That means that a HTTPS site with such a weak key in its certificate chain, or an ActiveX control signed using such a certificate, will be blocked as having an invalid signature.

    Within IE, this blockage will be visible via a notice that an executable or ActiveX control “was reported as unsafe”:

    image

    WinVerifyTrust will report a signature error for such files, because weak RSA keys do not provide a meaningful degree of protection. Regular readers will recall a similar issue when Windows began rejecting MD2 & MD4 signatures. An exception is made for ActiveX controls or Authenticode signed downloads that have a cryptographically-validated timestamp showing that they were signed prior to Jan 1, 2010.

    For HTTPS sites, the blocking experience is different. When navigating to a HTTPS page that relies upon a weak RSA key for security, navigation will be interrupted by the “Problem with security certificate” error page:

    image

    In this situation, the “continue to this website” link will be non-functional; you cannot use the browser UI to override this blockage.

    You can distinguish the source of this error from other HTTPS errors by the lack of a specific “error text” message appearing at the normal location; e.g. in the HTTPS Subject CN Mismatch error, you see the following explanatory text:

    image


    Faster Certificate Revocation

    The PKI team also announced a new improvement to help address certificate revocation latency, which you can read about here. The highlight is that the new feature provides dynamic updates for revocation information, so that Windows clients can learn of untrusted certificates within at most a day of the information being published (no user interaction required); this updated information helps serve as a backstop for Windows’ existing certificate revocation checking support.

    -Eric


    Ordinarily, Internet Explorer loads local HTML files in the Local Machine Zone. Locally-loaded HTML files are subject to the Local Machine Lockdown feature which prevents pages from running active content like JavaScript or ActiveX controls, showing the following notification:

    image

    In order to avoid this lockdown, many local HTML pages will contain a Mark-of-the-Web (MOTW) which instructs Internet Explorer to load the content using the security permissions of a different zone, typically, the Internet Zone. There are several ways to assign a MOTW:

    1. A comment inside HTML markup
    2. Using an NTFS Alternate Data Stream named Zone.Identifier
    3. A Low Integrity Label in the permissions on the file
    4. Loading the file from the Temporary Internet Files folder

    Internet Explorer itself uses each of these methods in different circumstances, but most developers who are designing HTML to load from local locations will use the HTML comment format, by adding one line to their markup:

    <!doctype html> 
    <!-- saved from url=(0014)about:internet -->
    <html><head>...

    When Internet Explorer encounters this comment, it maps the current document into the Internet Zone, and it runs with the permissions of an Internet-based document so JavaScript and ActiveX controls may run with appropriate limits (if permitted by your Internet Zone settings).

    However, putting local content into the Internet Zone has one very important consequence in Windows 8. The Internet Zone runs in Enhanced Protected Mode (EPM) in Metro-style IE and optionally in Desktop IE. One of the key strengths of the AppContainer isolation mechanism upon which EPM relies is that AppContainers do not permit “Read up” access. That means that content running in AppContainer doesn’t have direct read access to most areas on your hard drive; attempting to open a file in an AppContainer will result in an Access Denied error. (In contrast, the IE7-IE9 Protected Mode feature allowed “Read up” access; only writes were forbidden.)

    When Internet Explorer is instructed to load the local HTML page above, it first assumes that the content will be in the Local Machine Zone and begins reading the content in a process which is running at Medium Integrity outside of AppContainer. When it encounters the MOTW, it realizes that the content should be loaded in EPM, so it launches an EPM process to take over the loading of the content. The EPM process is now “stuck”—it doesn’t have access to the local file, because the AppContainer forbids reading of the file. The solution to this conundrum is pretty simple—when Internet Explorer’s EPM encounters an Access Denied error on a local file, it asks IE’s broker process (running at Medium) to see whether that local file has a MOTW. If so, the broker process provides read access to the file, enabling the restricted process to read it. If no MOTW is found, then the read is denied and the browser will not render the content.

    Now, where this gets tricky is when a page refers to other resources, called “subdownloads.” For instance, consider the following markup:

    <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN">
    <!-- saved from url=(0022)http://127.0.0.1:8088/ -->
    <HTML><HEAD>
    <META http-equiv="Content-Type" content="text/html; charset=windows-1252">
    </HEAD>
    <BODY>This page contains several image files loaded from the same path:<BR>
    <PRE>
    GIF: <IMG src="1.gif"><br />
    PNG: <IMG src="2.png"><br />
    JPG: <IMG src="3.jpg"><br />

    <DIV id="divTime"></DIV>
    <SCRIPT>
      setInterval('document.getElementById("divTime").innerText = new Date();', 1000);
    </SCRIPT>
    </PRE></BODY></HTML>

    The page refers to three image files in the same folder as the parent markup. If these files do not have a MOTW applied to them, a local page running inside EPM will not be able to load them:

    image

    (Note: my screenshots are from Desktop IE because I’ve enabled Enhanced Protected Mode in Desktop IE using the Tools > Internet Options > Advanced > Security checkbox. EPM is enabled by default for Metro-style IE.)

    You can determine whether or not a file has a MOTW by using icacls.exe to check for a Low Mandatory Level label, or by using streams.exe to check for a NTFS Alternate Data Stream named Zone.Identifier. By default, neither will be present:

    image

    After using icacls.exe to apply a Low Integrity label to files #1 and #3, you can see the difference:

    image

    Now, let’s apply a MOTW to 2.png using an Alternate Data Stream:

    image
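
    From a command prompt, the two marking approaches look roughly like this (a sketch using this example’s filenames):

    rem Apply a Low Integrity label (method #3 from the list above)
    icacls 1.gif /setintegritylevel Low
    icacls 3.jpg /setintegritylevel Low

    rem Write a Zone.Identifier Alternate Data Stream marking the file as Internet Zone (method #2)
    echo [ZoneTransfer]> 2.png:Zone.Identifier
    echo ZoneId=3>> 2.png:Zone.Identifier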

     

    Now all three images are visible:

    image

    As you can see, both the Alternate Data Stream and the Low Integrity Label can be used to mark the content as accessible from an Enhanced Protected Mode process. However, these methods can be cumbersome to apply manually, so we’ve provided one simpler mechanism local pages may use.

    If the markup file is named page3.htm, you can create a folder named page3_files and place all of the resources required by the page into that folder. Update your markup’s references to refer to the resources in that subfolder:

    <IMG src="page3_files/1.gif">
    <IMG src="page3_files/2.png">
    <IMG src="page3_files/3.jpg">

    …and you will find that the images are allowed to load, even without a MOTW applied to each. That’s because Enhanced Protected Mode automatically grants page3.htm permission to read these local files when placed in the specially-named subfolder.


    -Eric Lawrence


  • 07/13/12--10:00: Brain Dump: International Text (chan 5189999)
  • Note: The “brain dump” series is akin to what the support.microsoft.com team calls “Fast Publish” articles—namely, things that are published quickly, without the usual level of polish, triple-checking, etc. I expect that these posts will contain errors, but I also expect them to be mostly correct. I’m writing these up this way now because they’ve been in my “Important things to write about” queue for ~5 years. Alas, these topics are so broad and intricate that a proper treatment would take far more time than I have available at the moment.

    Handling of non-ASCII text is a common source of compatibility and interoperability problems. This post covers a variety of tidbits related to this topic, and it will be expanded (and likely corrected) over time.

    RFC2616 defining HTTP/1.1 suggests that non-ISO-8859-1 text in HTTP headers must be encoded according to the rules of RFC2047, an approach that was not commonly implemented by many web clients. Many clients will instead send or accept raw UTF-8 or bytes encoded using the current system’s ANSI codepage instead. Character-set mismatches often result in interoperability problems.

    Internet Explorer’s handling of non-ASCII text is partially controlled by these checkboxes in the Advanced tab:

    image

    Always show encoded addresses is disabled by default. When enabled, it forces IE to show the raw Punycode in the address bar at all times when viewing an IDN site, even if that site’s IDN URL follows the non-spoofability rules.

    Send IDN server names is enabled by default and will force IE to encode hostnames in URLs following the rules of RFC3491 and RFC3492. The user will be shown the URL in the address bar in Unicode form if and only if the URL is deemed non-spoofable. Please see this IEBlog post on the rules of IDN non-spoofability.

    Send IDN server names for Intranet addresses is disabled by default for compatibility with legacy Windows networks that were using UTF-8 to support non-ASCII hostnames. Other browsers, to the best of my knowledge, do not have special handling for Intranet sites, and I believe that current versions of Active Directory and the Windows DNS server support punycoded hostname registration and lookup.

    Send UTF-8 URLs is checked by default, but doesn’t behave as broadly as its name implies. This option controls whether certain URL components and headers are sent and interpreted using UTF-8 or the system’s ANSI codepage, but it does not apply to the entire URL.

    Show Notification bar for encoded addresses is checked by default; it informs the user that they are seeing Punycode in the address bar only because the current site’s address follows the rules for IDN non-spoofability except that it uses characters outside of the current user’s configured Accept-Languages. The notification bar allows the user to adjust the configured Accept-Languages using the Internet Control Panel.

    Use UTF-8 for mailto links is unchecked by default, but is checked when installing current versions of Outlook. You can learn a lot more about this option in this IEBlog post. The option has been removed for Windows 8 / Internet Explorer 10, and mailto links are always passed to the client application using %-encoded UTF-8.

    Submission of text in HTML forms in Internet Explorer is a fascinating and complex topic. The design of form encoding in IE8 and earlier was to submit forms using the encoding of the submitting page by default. If the FORM element on the page declared the Accept-Charset attribute equal to UTF-8 (which is the only supported value) and if the form results contained data that could not be encoded in the page's encoding, then the form results would be sent as UTF-8. In IE9 standards-mode and later, IE will always encode form results as UTF-8 if the accept-charset="UTF-8" attribute is present.

    If your web form contains an INPUT TYPE=HIDDEN element with the name _charset_ this field will be automatically filled with the name of the character set used to encode the form when it is submitted. This helps permit your server to decode the form using the proper encoding.
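
    A sketch of such a form (field names other than _charset_ are hypothetical):

    <form action="/submit" method="post" accept-charset="UTF-8">
      <input type="hidden" name="_charset_">
      <input type="text" name="comment">
      <input type="submit" value="Send">
    </form>

    When the form is submitted, the _charset_ field arrives at the server filled with the name of the encoding actually used, e.g. UTF-8 or windows-1252.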

    In contrast, it’s not always possible to reliably reconstruct querystrings at the server (no, that was not a typo!), because IE does not pass any state information to the server which would indicate what encoding was used.

    URLs in IE may use up to three (!!) different encodings at once: punycode in the hostname, %-escaped UTF-8 for the path, and raw codepaged-ANSI for the query and fragment components. This is clearly a mess, but fixing it to match the IRI specification incurs compatibility costs. (Trust me, we’ve tried!)

    Internet Explorer’s XMLHTTPRequest object will not automatically encode your URIs for you (e.g. %-escaping UTF8 characters). If you want to send such characters to the server following the rules of IRI, you should encode them before passing them to the open() method, using the encodeURIComponent JavaScript API.
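
    For example (a sketch; the URL is hypothetical):

    var query = "\u30C4";  // some non-ASCII text
    var xhr = new XMLHttpRequest();
    xhr.open("GET", "/search?q=" + encodeURIComponent(query), true);
    xhr.send();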

    If you’re downloading files to IE9+ or other modern browsers, you should use RFC5987 encoding for the Content-Disposition header. If you need to support old versions of IE, the story is more complicated. This IEInternals post explores that topic.

    In WordPad (and most RichEdit controls in Windows) you can simply type a four-digit hexadecimal number (e.g. 30C4) and then hit ALT+X to convert that sequence to the corresponding Unicode character (i.e. ツ). Similarly, you can paste a Unicode character into WordPad and hit ALT+X to convert it back to its Unicode value.

    In Windows, encoding of non-ASCII characters in File-scheme URIs (e.g. file://server/path/file.txt) is different than in other schemes. %-encoded octets in a FILE uri are always interpreted using the system’s ANSI codepage, not UTF-8. Learn more about this and File URIs in general here.

    -Eric


    Back in March of 2011, I mentioned that we had encountered some sites and servers that were not sending proper Content-Length headers for their HTTP responses. As a result, we disabled our attempt to verify Content-Length for IE9.

    Unfortunately, by April, we’d found that this accommodation had led to some confusing error experiences. Incomplete executable files were not recognized by SmartScreen’s Application Reputation feature, and other signed filetypes would show “xxxx was reported as unsafe” because WinVerifyTrust would report that the incomplete file’s signature was corrupt. This problem was very commonly reported for large files (e.g. 50mb installers) by users in locations with spotty network access (e.g. where such connections are often interrupted).

    With IE10, we’ve reenabled the Content-Length / Transfer-Encoding checks in IE’s Download Manager. If the Download Manager encounters a transfer that does not include the number of bytes specified by the Content-Length header, or the transfer fails to include the proper 0-sized chunk (when using Transfer-Encoding: chunked), the following message will be shown:

        IncompleteFile

    If the user clicks Retry, IE will attempt to resume (or restart) the download. In many cases of network interruption, this feature helps ensure that the user is able to download the complete file. As a compatibility accommodation, if the retried transfer again does not provide the expected number of bytes, Internet Explorer will permit the download to be treated as “finished” anyway, so that users are not blocked from interacting with buggy servers.

    For instance, one buggy pattern we've seen is a server which delivers the HTTP response body as a single chunk, then calls HttpResponse.Close() instead of the proper HttpApplication.CompleteRequest.

      // Add Excel as content type and attachment
      Response.ContentType = "application/vnd.ms-excel";
      Response.AddHeader("Content-Disposition", "attachment; filename=" + binTarget);

      mStream.Position = 0;
      mStream.WriteTo(Response.OutputStream);
      Response.Flush();

      // BAD PATTERN: DO NOT USE. 
      // See http://blogs.msdn.com/b/aspnetue/archive/2010/05/25/response-end-response-close-and-how-customer-feedback-helps-us-improve-msdn-documentation.aspx
      Response.Close();                 
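      // Preferred: let ASP.NET finish the response normally (which emits the final
      // zero-length chunk), e.g. via HttpContext.Current.ApplicationInstance.CompleteRequest()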

    Calling Close() like this omits the final chunk, and would cause the server's output to fail in the Download Manager if not for the compatibility accommodation.

    You can test how browsers handle incorrect transfer sizes using these two Meddler scripts:

     -Eric


    Note: The “brain dump” series is akin to what the support.microsoft.com team calls “Fast Publish” articles—namely, things that are published quickly, without the usual level of polish, triple-checking, etc. I expect that these posts will contain errors, but I also expect them to be mostly correct. I’m writing these up this way now because they’ve been in my “Important things to write about” queue for ~5 years. Alas, these topics are so broad and intricate that a proper treatment would take far more time than I have available at the moment.

    Since IE6, Internet Explorer has implemented major architectural changes without accompanying breaking changes to its binary extension model. While new extension features have been introduced (e.g. Search Providers, Web Slices, and Accelerators), they are all based on markup rather than code and have been relatively straightforward to keep working from version to version.

    In contrast, Internet Explorer’s binary extension models: ActiveX, Browser Helper Objects (BHOs), Toolbars, etc, are all architected such that 3rd-party COM code runs within the Internet Explorer process. In many cases, extensions originally designed for IE6 (and earlier) continue to run without modification even in IE9 and IE10 on the Desktop. That’s despite the fact that virtually everything else around these extensions has changed: tabbed browsing and Protected Mode were introduced for IE7, Loosely-Coupled IE was added in IE8, Hang Resistance was introduced in IE9, and IE10 introduced Enhanced Protected Mode and other major changes throughout Windows. Each of these architectural shifts would break the majority of the binary extensions if not for a corresponding set of investments in compatibility features undertaken in each release of the browser.

    Windows Vista’s introduction of the Integrity Level system was accompanied by the UAC Virtualization system, designed to help accommodate applications that expected to be running with Administrative privileges. If a 32-bit executable’s manifest lacks a requestedExecutionLevel element (e.g. iexplore.exe’s embedded manifest doesn’t have one), then UAC Virtualization will be applied for file and registry operations. Browser extensions running in Internet Explorer benefit from this virtualization, enabling legacy add-ons that expect to be able to read or write to protected locations to continue working. Virtualization works by redirecting write operations from read-only areas to a per-user “virtualized” location. For instance, attempting to write a file to the Desktop from Low Integrity would ordinarily fail, but virtualization permits the operation to succeed by writing the file to a hidden folder elsewhere in the file system. (IE’s Low Integrity virtualization uses a shim to redirect writes to %USERPROFILE%\AppData\Local\Microsoft\Windows\Temporary Internet Files\Virtualized\, while UAC virtualization writes to %USERPROFILE%\AppData\Local\VirtualStore).

    However, virtualization alone isn’t enough to ensure compatibility. For instance, when tabbed browsing was introduced in IE7 and Hang Resistance was introduced in IE9, the behavior of windows and dialogs needed to be updated to be compatible with these features. For instance, when an extension in a background tab attempts to show a prompt, this prompt must be suppressed until that tab is activated (otherwise, a confusing experience would result). To accommodate that behavior, a system of shims and detours is used.

    These two technologies are similar:

    • Shims work by rewriting a module’s import address table at runtime to point to a different target function
    • MSR’s Detours work by rewriting the start of one or more target functions at runtime to point to a wrapper function

    These technologies allow Internet Explorer to intercept calls to important functions (e.g. CreateProcess, CoCreateInstance, CreateWindow, etc) and modify the behavior of those calls to improve compatibility with the restrictions and desired behaviors of the tab/content process in which HTML and add-ons run. For instance, the CreateProcess and CoCreateInstance APIs are wrapped such that the Protected Mode Elevation Policies can be applied. Similarly, CreateWindow is designed to accommodate the creation of new windows by background tabs, and to properly parent those windows to the correct window handle even though the window hierarchy was changed due to the hang resistance feature.

    In IE10, we’ve moved most functionality away from Detours to Shims for enhanced compatibility and because we’re shipping to a new platform (Windows RT) to which we otherwise would have had to port the IE version of Detours. In most cases, this was a seamless change, but we recently ran into one ancient toolbar that was impacted by the change.

    The toolbar in question was a simple one that offered a standard search box, a few notification icons, and a short set of menus that would launch dialog boxes to configure the toolbar and show information about it. Our compatibility testing team noticed that in IE10, the dialog boxes from the toolbar would never come up. Debugging native code extensions without source or symbols is never fun, but I decided to take a look anyway. I ran the installer and verified that the dialog boxes didn’t come up. Knowing nothing about the technology (e.g. maybe the dialogs were written in HTML), I took a quick look at the installation folder. I got an idea of how old the code was when I saw that the installation folder contained unicows.dll, an ancient library designed to help enable compatibility with pre-Unicode versions of Windows (e.g. 95/98).

    I next ran through the repro with IE10 running under the debugger and found a nested function deep inside a call to CreateWindow() was returning Access Denied. I then ran the same repro in IE9 under the debugger and found that CreateWindow succeeded, but observed that in IE9, there were detoured compatibility wrappers in the stack trace, but those wrappers were not present in the scenario in IE10.

    I spent several hours pondering this question and aimlessly touring around in the debugger. I was whining about this scenario to a colleague, complaining about code so ancient that it was shipping with unicows.dll, when I realized that I’d never used this library myself, and in fact I’d never seen a toolbar use it before. When trying to explain what it did to the colleague, I decided that I should probably stop hand-waving and pulled unicows up on Wikipedia. And bam, there it was, plain as day:

    By adding the UNICOWS.LIB to the link command-line [ ... ] the linker will resolve referenced symbols with the one provided by UNICOWS.LIB instead. When a wide-character function is called for the first time at runtime, the function stub in UNICOWS.LIB first receives control and [ ... ] if the OS natively supports the W version (i.e. Windows NT/2000/XP/2003), then the function stub updates the in-memory import table so that future calls will directly invoke the native W version without any more overhead.

    …and there’s the problem!

    When IE first loads a toolbar, the shims run against the module and wrap all calls to CreateWindow with a call to the compatibility wrapper function. But when IE loaded this toolbar, it didn’t find any calls to CreateWindow, because those calls had been pointed at a function inside unicows.dll instead of at the original function in user32.dll. As a result, the compatibility shim wasn’t applied, and the function call failed.

    Now, this wouldn’t have happened if unicows did its import-table fixup the “normal” way, using the GetProcAddress function. That's because the compatibility shims are applied to GetProcAddress as well, and the fixup would have been applied properly at the time that unicows did the update of the import table. However, for reasons that I thought were lost to the mists of time (see below), the implementers of unicows instead copied the source code of GetProcAddress from kernel32 into their own DLL, so the shims had no way to recognize it. While we could add a new shim to handle unicows.dll, the obscurity and low priority of this scenario mean that we instead decided to reach out to the vendor and request that they update their build process to remove the long-defunct support for Windows ‘9x.

    -Eric

    Update: Over on his blog, Michael Kaplan provided a history of why unicows.dll works the way it does. 

    PS: This MSDN article is a great resource that explains the PE file format and how linking and delay loading features work.


    When I first joined Office, I worked on the team responsible for delivering Help, Templates, and ClipArt into the client applications. As we were testing our work in various simulated customer environments, we found a big problem. At least one big customer (tens of thousands of licenses) had a network environment in which their users were forced to enter a username and password in order to authenticate to the proxy server. Without authenticating to the proxy, all HTTP/HTTPS requests were blocked.

    Now, this was a fairly uncommon architecture, even then, and is perhaps more so now. In most environments, either the proxy server doesn’t require authentication, or the proxy relies upon the NTLM/Kerberos authentication schemes which permit users’ Windows logon credentials to be automatically used to respond to challenges from the proxy server. Environments that rely upon BASIC or DIGEST authentication require that the user explicitly submit their credentials, typically once per process (because most networking components, e.g. WinINET, cache these credentials for the lifetime of the process).

    The problem with my features in Office was that they all passed the INTERNET_FLAG_NO_UI flag to WinINET, or ran atop WinHTTP, which explicitly doesn’t include any user-interface, including dialogs. The result of this was that in an environment with a BASIC/DIGEST proxy, all requests failed. In order to work properly in such environments, the application must itself supply the needed credentials to the network stack (e.g. for WinINET, call InternetSetOption, passing the INTERNET_OPTION_PROXY_PASSWORD and INTERNET_OPTION_PROXY_USERNAME option flags) to avoid the need to prompt the user.

    I added a new rule to Fiddler that made it simple to test products for this problem: 

    image

    When the Require Proxy Authentication box is checked, Fiddler automatically responds to any request lacking a Proxy-Authorization header with a HTTP/407 response containing a Proxy-Authenticate header specifying the authentication scheme required:

    GET /ua.aspx HTTP/1.1
    Accept: text/html, application/xhtml+xml, */*
    User-Agent: Mozilla/5.0 (compatible; MSIE 9.0; Windows NT 6.1; WOW64; Trident/5.0)
    Host: www.enhanceie.com


    HTTP/1.1 407 Proxy Auth Required
    Connection: close
    Proxy-Authenticate: Basic realm="FiddlerProxy (username: 1, password: 1)"
    Content-Type: text/html

    <html><body>[Fiddler] Proxy Authentication Required.<BR> </body></html>

    A client that supports manual proxy authentication will then prompt the user for the username and password:

    image

    The client will then reissue the same request, supplying the provided credentials (base64-encoded) in the Proxy-Authorization header:

    GET /ua.aspx HTTP/1.1
    Accept: text/html, application/xhtml+xml, */*
    Proxy-Authorization: Basic MTox
    User-Agent: Mozilla/5.0 (compatible; MSIE 9.0; Windows NT 6.1; WOW64; Trident/5.0)
    Host: www.enhanceie.com

    If the client fails to collect the credentials, it will typically treat the HTTP/407 response as fatal and will show an error message or fail silently.

    When you try this, you can find broken scenarios all over. For instance, when I tried to post this blog using Windows Live Writer, the following error message was shown:

    image

    Afterward, I was prompted to re-enter my credentials for the web server—there was no way to supply the credentials required by the proxy!

    Sometimes, an otherwise failing scenario may pass depending on what happens earlier in a process. For instance, if you enable the Fiddler rule, then launch IE to about:blank you will find that Search Suggestions from the Address bar don’t work, showing “An error occurred.”

    image

    Notably, if you subsequently navigate the tab to a web page, IE will prompt you for proxy credentials using the CredUI dialog box shown above. After you supply those credentials, the Search Suggestions feature starts working—that’s because the proxy credentials are cached for the lifetime of the process.

    In other cases, failures are silent and there’s no notice to the user. For instance, many background updaters are based on BITS/WinHTTP and will fail silently when a HTTP/407 is encountered. Similarly, Windows’ CAPI component’s Certificate Revocation Checks will fail because the svchost.exe process doesn’t have the required proxy credentials.

    If you need to sell your software into an enterprise that uses proxies, or just want to make your software robust against even uncommon network configurations, be sure to test manual proxy authentication scenarios!

    -Eric


  • 08/27/12--16:03: Downloading ZIP-Based Formats (chan 5189999)
  • More and more file formats are based on the ZIP format. The Open Packaging Conventions use ZIP as a base format, and that means frameworks like .NET’s System.IO.Packaging also generate files that are valid ZIP files. The Office 2007+ formats are ZIP-based, and more personally, Fiddler’s SAZ Format is ZIP-based.

    Unfortunately, this trend toward ZIP-based packaging incurs a problem when dealing with file types that are not registered in the server’s configuration. When sending unknown types, a simple server will typically send a Content-Type: application/octet-stream header, indicating very generically that the download in question is of a binary type without providing specific information. Internet Explorer’s MIME-sniffing code kicks in and says, hey, I see that you’ve provided a generic type. Lemme check that content and see if I know what it is.

    Now, the sniff for ZIP formats is dead-simple: Does the file start with 0x50 0x4B (aka ‘PK’)? If so, then it’s probably a ZIP file. And in the case of ZIP-based formats, the browser’s technically right, but behaviorally wrong. If the server didn’t specify a Filename in a Content-Disposition: attachment header, Internet Explorer will promptly rename the file away from its original extension to .ZIP. The browser will then consult with Windows and determine that the .ZIP file should be opened by a MIME Handler.

    For instance, downloading from http://webdbg.com/dl/saz.saz results in the following modal prompt:

    image

    If you choose Open, the MIME Handler is invoked and shows the guts of the ZIP file:

    image

    If you choose Save, the file will be saved to your downloads folder as a .ZIP. This is generally not what you want.

    As a mitigation for this problem, Internet Explorer 9 included an exemption list for the most popular ZIP-based formats of 2010; downloads whose URLs bore the following extensions are not renamed:

    .accdt; .crtx; .docm; .docx; .dotm; .dotx; .gcsx; .glox; .gqsx; .potm; .potx; .ppam; .ppsm; .ppsx; .pptm; .pptx; .sldx; .thmx; .vdw; .xlam; .xlsb; .xlsm; .xlsx; .xltm; .xltx; .zipx

    To avoid this problem for all ZIP-based types, servers have two options:

    1. Send a specific MIME-type identifying the file’s type
    2. Use a Content-Disposition header to specify the filename

    For instance, when the server is reconfigured to send a Content-Type: application/x-fiddler-session-archive MIME, the user gets the expected Download Manager notification, and the file extension is untouched:

    image
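
    Alternatively, option #2 requires only a single additional response header; a hypothetical example:

    Content-Disposition: attachment; filename=sessions.saz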

    The changing web suggests that it probably makes sense to get out of the business of sniffing ZIP files, as such sniffing is likely now causing more problems than it solves.

    -Eric

