User:Eighty5cacao/misc/HTTPS Everywhere/feature

This page is not intended for the developers to read directly. It is a scratchpad for me to collect my ideas if/when I file bug reports for these features.

Better ways to detect CAcert availability

The platform="cacert" code doesn't seem to detect manually-installed CAcert root certificates, at least for Firefox on Windows systems.

I assume the code that handles platform is looking for specific Linux distributions that ship CAcert root certificates in the operating system-level certificate store (TODO: verify).

A stopgap fix could be a pref that overrides the detection, like the one recently added for mixed content blocking.

Allow user rulesets to be flagged as "superseding" a built-in ruleset

...so as to reduce noise in an automated problem-reporting system for users who are testing changes to built-in rulesets (and would currently need to disable the built-in ruleset in question).

Perhaps a first draft of this could be, "If a user ruleset has a name conflict with a built-in ruleset, (optionally) prefer the user ruleset and log the situation as a warning rather than an error" (user-visible UI warnings MAY be provided, and any part of the behavior MAY be conditional upon a configuration setting and/or an attribute of the user ruleset's ruleset element). A later implementation could cover the case where the user ruleset has a different name from the built-in ruleset it supersedes, with the built-in's name specified through an attribute like supersedes_builtin=.
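
A rough sketch of what that later form might look like, assuming supersedes_builtin= is accepted as an attribute of the user ruleset's ruleset element (the ruleset names, hostname, and rule below are placeholders):

<ruleset name="Example (user copy)" supersedes_builtin="Example">
        <target host="www.example.com" />
        <rule from="^http://www\.example\.com/" to="https://www.example.com/" />
</ruleset>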

User exclusion sets

Suppose there were a built-in ruleset named "Example0" that covered the domain bar.example.com, and badpage.htm on that domain suddenly broke (such as by redirecting to http). We should let users write something like:

<userexclusionset name="Example0 (emergency fix)" applies_to_ruleset="Example0">
        <target host="bar.example.com" />
        <exclusion pattern="^http://bar\.example\.com/badpage\.htm(?:\?|$)" />
</userexclusionset>

TODO: Consider whether the UI design demands a name attribute (should the user exclusion set be given its own line in the ruleset list, or should it silently apply without a checkbox choice?). Also: whether targets are needed, what syntax should be defined for "exclude from securecookie," and whether we should provide a mechanism to add coverage to an existing ruleset rather than merely exclusions (ticket 10033 mentions a work-in-progress Chrome implementation of the last).

Provide a means by which a rule in one ruleset can override a downgrade rule in another ruleset

The proposed attribute name is softdowngrade.

softdowngrade behaves like downgrade except that the code should check whether there is an enabled rule that would rewrite the opposite way (as there would be if both rulesets were enabled in the example below) and ignore the softdowngrade if so.

One real-site example is an xkcd comic that has unsecurable mixed scripts and fails to display any image when those scripts are blocked.
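
A rough sketch of how such a pair of rulesets might be written, assuming softdowngrade is expressed as a rule attribute in the same way as the existing downgrade attribute (all hostnames, paths, and the default_off wording are placeholders, not the actual xkcd rulesets):

<!-- Default-on ruleset: keeps the broken path on http via softdowngrade -->
<ruleset name="Example">
        <target host="www.example.com" />
        <exclusion pattern="^http://www\.example\.com/comics/" />
        <rule from="^http://www\.example\.com/" to="https://www.example.com/" />
        <rule from="^https://www\.example\.com/comics/"
                to="http://www.example.com/comics/" softdowngrade="1" />
</ruleset>

<!-- Default-off companion: rewrites the opposite way; when enabled, the
     softdowngrade above would be ignored rather than fought -->
<ruleset name="Example (unsecurable mixed content)" default_off="mixed content">
        <target host="www.example.com" />
        <rule from="^http://www\.example\.com/comics/" to="https://www.example.com/comics/" />
</ruleset>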

As a further example, a similar situation has already been found in the search feature on Stack Exchange sites (the downgrade in Stack-Exchange.xml ought to be a softdowngrade). (Does the fighting between the current Stack-Exchange.xml and Stack-Exchange-mixedcontent.xml cause redirect loops if the latter is manually enabled? This needs testing.)

This feature should be used only for pages that have true (unsecurable) mixed content that breaks major functionality (including, but not limited to, layout breakage severe enough to make the site unusable by an experienced, normally-abled user).

To consider: Instead of defining softdowngrade, is it better simply to give the existing downgrade attribute such "soft" behavior (i.e., explicitly give normal rules precedence over downgrade rules)? IIRC, currently, no code in the HTTPS Everywhere browser extension actually reads the downgrade attribute; it is read only by validation scripts as part of the build process.

New values for platform attribute

  • development - Enable in development branch of HTTPS Everywhere only (aside: consider using the development branch of HTTPS Everywhere in all Tor Browser Bundle branches)
  • mixedordev - Enable if HTTPS Everywhere is from a development branch or the browser lacks mixed-content blocking. This is needed because the existing platform system requires all platform tags to match, with no way to express OR logic. To promote testing, robotically-added mixedcontent flags would be changed to mixedordev for rulesets that have not since been edited by a human, as well as for special classes of sites like educational institutions and banks (see the sketch after this list)
  • mixedpost - Really means false mixed POST; Firefox checks for these separately from mixed content, so a fix for bmo:878890 might not automatically address this; true mixed POSTs should(?) generally be handled by splitting coverage of the referring page to a default-off ruleset if major functionality is broken
  • mixedxhr - when strict enforcement of same-origin policies causes problems with XMLHttpRequest calls (see torbug:7851)
  • nopreload - for browsers that do not include HSTS preload domain lists; rulesets generated from Chromium's HSTS preload list currently use platform="firefox", which is no longer accurate now that Firefox has some of the preloads (TODO: "Firefox enables HSTS preloading but intentionally rejects domains that do not send an HSTS header with expiration time greater than 18 weeks." Find the primary Mozilla source, and verify the specific contents of the list)
  • softmixed (or falsemcb?) - exact definition to be decided later; needed in order to distinguish "good" and "bad" MCB implementations
  • tor (or torbrowser) - needed if we ever want to allow hidden services as rewrite destinations; would initially be handled by the same validation scripts as downgrade, but more detailed validation could be performed later - mailing list discussion exists on whether this is worth doing at all
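
A rough sketch of how one of these values, mixedordev, would appear in a ruleset (the hostname and rule are placeholders; only the platform value itself is part of the proposal above):

<ruleset name="Example (mixed content)" platform="mixedordev">
        <target host="www.example.com" />
        <rule from="^http://www\.example\.com/" to="https://www.example.com/" />
</ruleset>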

Pseudoplatforms

These should not disable any rulesets by default; they should instead alter the wording of the browser's TLS error pages to be more informative, specifically to explain that the needed TLS feature is most likely being broken by an intercepting proxy or webmaster misconfiguration. An initial implementation MAY treat these as no-ops.

These won't work under the current platform system, so perhaps define subplatform? (A sketch follows the list below.)

  • cnnic - for sites that legitimately use said certification authority; may include the website of CNNIC itself (unlike cacert and ipsca, this is a pseudoplatform because the CA needs to be manually distrusted; most users who choose to distrust CNNIC would rather give up the ability to access secured parts of the official CNNIC website than fall back to unencrypted HTTP)
  • rc4 - for sites that only support ciphersuites that contain RC4
  • sni - for sites that require SNI in order for a matching certificate to be obtained, such as those that use CloudFlare's free service tier. Compare the snionly attribute in Chromium's HSTS preload list (rulesets for such sites would have platform="nopreload" subplatform="sni").
  • tls11 - for sites that require TLS 1.1 or higher
  • tls12 - for sites that require TLS 1.2 or higher
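
A rough sketch of the sni case, assuming subplatform is accepted as a ruleset-level attribute alongside platform (the hostname and rule are placeholders):

<ruleset name="Example (CloudFlare free tier)" platform="nopreload" subplatform="sni">
        <target host="www.example.com" />
        <rule from="^http://www\.example\.com/" to="https://www.example.com/" />
</ruleset>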

Override DNS lookups

Suggested syntax: <dnsoverride host="sd.sharethis.com" ipv4_blacklist="184.72.49.139" />

This would be used to work around broken load-balancing arrangements.

There should also be positive ipv4 and ipv6 attributes, to force the use of the specified IP(s) for the specified hostname(s), even if the browser receives a single A record for some other IP.

(of course, ipv6_blacklist should be available too)

We should probably have different attribute names to specify hosts via either simple matches (like target) or regexes (like securecookie).

For a dnsoverride element to be effective for a given host, that host MUST also be listed in the ruleset's targets.
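
A rough sketch of a ruleset that obeys this requirement, reusing the suggested attribute names from above (the ruleset name and rule are placeholders, and the delimiter question below is sidestepped by listing a single IP):

<ruleset name="ShareThis (sketch)">
        <target host="sd.sharethis.com" />
        <!-- Effective only because sd.sharethis.com is also listed as a target above -->
        <dnsoverride host="sd.sharethis.com" ipv4_blacklist="184.72.49.139" />
        <rule from="^http://sd\.sharethis\.com/" to="https://sd.sharethis.com/" />
</ruleset>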

An attempt to blacklist an IP address corresponding to the only available A or AAAA record for a domain SHOULD be treated as a no-op and MUST generate a log message (TODO: at what severity?).

TODO: Decide how multiple IPs or hostnames should be delimited (comma? pipe? ...)

A Firefox implementation might depend on bmo:652295, though that bug isn't quite about overriding the built-in DNS resolver...

Load balancing

That is, allow a rule to specify multiple rewrite destinations among which one will be chosen randomly, to be used in cases where equivalent content is available on multiple hostnames. Real-site examples would only need to show the syntax, not demonstrate best practices.
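
As a purely hypothetical sketch of one possible syntax (the to_choice element and every hostname below are assumptions of mine, not part of any existing proposal):

<ruleset name="Example CDN (load balancing)">
        <target host="images.example.com" />
        <!-- One of the to_choice destinations would be picked at random -->
        <rule from="^http://images\.example\.com/">
                <to_choice to="https://cdn1.example.net/" />
                <to_choice to="https://cdn2.example.net/" />
        </rule>
</ruleset>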

To be reevaluated: The first time a load-balancing rule is hit, remember the random decision made for the current browser session, so that the rewriting of any given URL is deterministic within a session. (Or remember only for a limited time, in addition to not persisting between sessions?)

If it is considered undesirable to consume entropy from the browser's PRNG, perhaps some form of HMAC using the originally-requested URL as the message and the time the browser session started as the key might be suitable as a "random" number.

Advanced string operations

This would cover operations such as letter case transformations and percent (un)encoding. Perhaps the to field could contain something like $lc{1} to mean "the lowercase version of the string matched by the first parenthesized group in the corresponding from"? This could be useful for dealing with redirection scripts:

<ruleset name="Example (partial)">
    <target host="foo.example.com" />

    <!--
        foo.example.com lacks HTTPS support of its own, so only its redirector
        is rewritten, jumping straight to the HTTPS destination.
        $decode{1} stands for the percent-decoded text matched by the first
        parenthesized group in from (a hypothetical operator, like $lc{1} above).
    -->
    <rule from="^http://foo\.example\.com/redirect\.php\?u=https(?::|%3[Aa])(?:/|%2[Ff]){2}(.+)&#38;bar=.*&#38;baz="
            to="https://$decode{1}" />
</ruleset>

TODO: explain other use cases

What an automated problem-reporting system should cover

The existing proposal(s) seem(s) only to cover rulesets manually disabled by the user. The problem is a limitation of the current UI: If a casual user doesn't bother to click on the icon or the Tools menu entry, they may not be aware that a redirect loop exists. If it is the top-level document that has experienced a redirect loop, they may think there is no rule coverage for that URL. Consequently, they might not disable the ruleset in question. Thus, a problem-reporting system should also handle redirect loops.

It's probably also a good idea to report SSL/TLS protocol errors for sites with active rulesets (certificate-related or not); among other reasons, such errors may not be noticed if they are triggered by third-party content.

(For everything between here and the top of the section, "ruleset" means built-in rulesets only.)

When reporting that a user has manually disabled a (built-in) ruleset, optionally also report whether any user rulesets are active for the URLs on which the built-in ruleset was found disabled - but don't report the contents of said user rulesets, of course.

Warn the user more loudly when user rulesets fail to load

...or possibly also when they are being disabled by default.

Theoretically, anyone technically oriented enough to work with user rulesets should be smart enough to validate their XML and regexes by eye (or script). However, people like me sometimes make stupid typos and then (1) get too lazy to check the Error Console and/or (2) visit websites that spam the Error Console heavily for unrelated reasons.

Warn the user more loudly when redirect loops exist

  • Immediately after the redirect loop happens: infobar and/or change of menubar icon
  • Also, display an icon by the ruleset's line in the options window to signify whether the ruleset has ever been guilty of redirect loops during the current browser session

Cert overrides

The torbug:8958 proposal should probably be a new element name rather than an attribute of rule elements; say, <certoverride host="deliveryimages.acm.org" accept_hostname="*.akamaihd.net" /> (a real example adapted from bmo:644640#c127). We should probably also define an "accept_fingerprint" mechanism to override errors other than mismatches. Should the host attribute be a simple match (as in target elements) or a regex (as in securecookie)? Or should we provide options for both?
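
A rough sketch of the fingerprint variant (the accept_fingerprint name comes from the text above; the sha256: prefix and the hostname are assumptions of mine, and the digest is a placeholder):

<!-- Hypothetical: accept a specific certificate by fingerprint regardless of the nature of the error -->
<certoverride host="static.example.org" accept_fingerprint="sha256:PLACEHOLDER-DIGEST" />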