
Whose Cert Is It Anyway?

source link: https://www.netmeister.org/blog/caa-diversity.html

May 14th, 2023

This is the third blog post on the topic of the centralization of the internet. The first post, discussing diversity of authoritative name servers, can be found here; the second post, discussing diversity of MX records, here.

Screengrab from the TV show 'Whose Line Is It Anyway?' with 'line' replaced by 'cert'.

Remember the X.509 PKI? You know, the one that gave us such hits as "Oh wait, certificate revocation is basically all broken", The One Where That Dutch CA Issued A Fraudulent *.google.com Cert, and of course my all-time favorite, Honest Ahmed's Used Cars and Certificates? It's great, because it secures virtually all web traffic, and all you have to do is get a certificate from a certificate authority - any one at all!

That's right, no need to be picky: any certificate authority can sign any domain name, so you can pick from literally hundreds, since that is the number of trusted CA root certificates baked into your browser[1] or included in most operating systems:

$ security find-certificate -a \
        -Z /System/Library/Keychains/SystemRootCertificates.keychain | \
        sed -n -e 's/.*alis"<blob>=//p' | wc -l
     166
$ security find-certificate -a                                         \
        -Z /System/Library/Keychains/SystemRootCertificates.keychain | \
        sed -n -e 's/.*alis"<blob>=//p' | more
"Go Daddy Root Certificate Authority - G2"
"HARICA TLS ECC Root CA 2021"
"SwissSign Platinum CA - G2"
"NAVER Global Root Certification Authority"
"[email protected]"
"OISTE WISeKey Global Root GA CA"
"KISA RootCA 1"
"Actalis Authentication Root CA"
"D-TRUST Root CA 3 2013"
"Apple Root CA - G2"
"StartCom Certification Authority G2"
"SSL.com EV Root Certification Authority ECC"
"Hellenic Academic and Research Institutions RootCA 2015"
"ePKI Root Certification Authority"
"AAA Certificate Services"
"VeriSign Class 3 Public Primary Certification Authority - G5"
"VeriSign Class 3 Public Primary Certification Authority - G3"
"Trustis FPS Root CA"
"Apple Root CA - G3"
[...]

But chances are, you really only want a very small number of CAs to do that -- the ones that you have a business relationship with or that you use for free. To solve that problem, the industry has tried a few things with varying degrees of success.

Possible Alternatives to CAA Records

For a while, we tried to tell the browser which CAs can issue a cert for a given domain via dynamic HTTP Public Key Pinning (HPKP), an HTTP response header (Public-Key-Pins). Like HTTP Strict Transport Security (HSTS), this does not address the Trust on First Use issue; in addition, it was quickly identified as a pretty big footgun, so it was deprecated and support for it removed from the browsers. Except, of course, that static HPKP, whereby pins are baked into the browsers, remains alive[2] (and likely forgotten by the various companies who submitted their pins years ago).

Certificate Transparency[3] was supposed to make up for (dynamic) HPKP being deprecated, but of course that shifts the defense mechanism from prevention to detection. Monitoring all certs in the logs for all of your domains is far from trivial: accounting for typo- and bitflip squatting, insult domains, and reserving every language variant of your trademark in the almost 1500 TLDs, many large organizations end up with literally thousands of domains to keep track of. No surprise CT Monitoring As A (Paid) Service is now a thing...

And of course there's also a solution that works perfectly well but isn't used at all because it depends on DNSSEC: pinning your cert in the DNS using DNS-based Authentication of Named Entities, aka DANE, but that aside...

CAA Records

...the preventative mechanism that has seen at least some adoption is the use of Certification Authority Authorization or CAA DNS Resource Records, specified in RFC8659.

Checking CAA records was made a requirement for Certificate Authorities via CA/B Forum Ballot 187 in 2017[4]. The idea here is that you specify in the CAA records the names of the CAs that you wish to grant authorization to issue certificates for the domain in question. Sounds straightforward, right?

Unfortunately, there are a few pitfalls to consider. One is that the determination of the CAA record to use for a given FQDN is performed as a left-to-right first match: the lookup starts at the full name and strips labels from the left until it finds a name that has CAA records. This is useful, because it allows you to have different records for sub.domain.example.com and domain.example.com, with perhaps a catch-all record set on the second-level domain (example.com). (And yes, you can have CAA records on a TLD, but as of early May 2023, no TLD has one set.)
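The lookup order described above can be sketched as a small shell function. This is only an illustration: the name caa_candidates is made up for this example, and the function makes no DNS queries; it just lists the names that would be checked, most specific first.

```shell
# caa_candidates (hypothetical helper): print the names at which CAA
# records would be looked up for a given FQDN, most specific first.
# No DNS queries are made; this only illustrates the search order.
caa_candidates() {
    name=${1%.}                        # drop a trailing dot, if any
    while [ -n "$name" ]; do
        echo "$name"
        case $name in
            *.*) name=${name#*.} ;;    # strip the leftmost label
            *)   break ;;              # reached the TLD
        esac
    done
}

caa_candidates sub.domain.example.com
```

Each name printed could then be passed to host -t caa in turn, stopping at the first one that returns any CAA records.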

Where this gets complicated, however, is when it comes to CNAME records. Per RFC2181, a given label in the DNS may not have any other records if it has a CNAME record (except the associated DNSSEC records), and the CAA resolution must follow the CNAME. This gets messy quickly.

The other pitfall is that you have to have your act together for all of your domains: you need to know which domains are used where and how, which have subdomains CNAMEd to third parties, which have subdomains you delegate, which you use internally and which externally, etc. Many large organizations are really, really bad at this.

But alright, as so often, it is what it is. Still better than allowing Honest Ahmed and everybody else to issue certs in your domains. So let's take a look at how widely used CAA records actually are.

Use of CAA records

Like before for NS and MX records, I once again pulled down the various gTLD zone files and combined them with whatever ccTLD data I could get my hands on, ending up with just around 214 million domain names in almost 1200 TLDs. In addition, I also took a look at the Tranco Top 1M Domains list and compared results for all TLDs and the Top 1M domains.

In total, fewer than 3 million domains have CAA records; fewer than 50K for the Top 1M domains. That's barely 1.4% of all domains across all TLDs, or 4.8% of the Top 1M domains -- not that great, adoption-wise.

pie chart showing % of domains with CAA records vs without
pie chart showing % of Top1M domains with CAA records vs without

Of those domains that do have CAA records, what do they look like? RFC8659 defines the resource record to be of the format CAA <flags> <tag> <value>. The tag-value portion is called a property, and each domain may have zero, one, or more properties defined.
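To make the record format concrete, here is a minimal sketch that splits lines in the format printed by host -t caa (as seen in the examples further down) into their fields. The function name caa_fields and the sample record are made up for illustration.

```shell
# caa_fields (hypothetical helper): read "host -t caa" output on stdin
# and print the owner name, flags, tag, and value of each CAA record.
# Lines that do not look like CAA records are silently skipped.
caa_fields() {
    sed -n 's/^\(.*\) has CAA record \([0-9]*\) \([^ ]*\) "\(.*\)"$/name=\1 flags=\2 tag=\3 value=\4/p'
}

# Hypothetical sample record; no DNS query is made here.
printf '%s\n' 'example.com has CAA record 0 issue "letsencrypt.org"' | caa_fields
```

Against live data, the same function could be fed directly from host -t caa for the domain of interest.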

The majority of domains that do have CAA records set appear to use a small number of CAs, commonly <=5, which then adds around 10 records in total, which is indeed the most frequently seen number of CAA records:

# of CAA records   # of domains

10                 1,060,973
8                  841,310
1                  420,128
2                  313,908
3                  65,862

Of course there are outliers, too: almost 900 domains have over 20 CAA records, and some domains have even more than 50!

# of CAA records   domain name

59                 benemortasia.us.
59                 lifelessandcalm.com.
59                 unorganized.email.
57                 benemortasia.com.
36                 estrategiaadigital.fun.

CAA flags

The flags field should practically be exactly either 0 or 128, as no other values are currently defined. But this being an RFC, it's of course needlessly complicated and easy to misunderstand: the Issuer Critical Flag is bit 0 of the flags field, and not the value of this field. That is, to set bit 0, you have to specify a value of 128; a value of 1 still leaves that bit unset.
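The bit arithmetic is easy to check in the shell. The following is a minimal sketch; caa_is_critical is a made-up name, and the values tested are the ones from the table in this section.

```shell
# caa_is_critical (hypothetical helper): succeed if the Issuer Critical
# Flag is set in the given CAA flags value. Bit 0 is the most significant
# bit of the 8-bit flags field, so it carries the value 128.
caa_is_critical() {
    [ $(( $1 & 128 )) -ne 0 ]
}

for flags in 0 1 10 128 250; do
    if caa_is_critical "$flags"; then
        echo "$flags: critical flag set"
    else
        echo "$flags: critical flag unset"
    fi
done
```

Note that this only tests the critical bit; whether a value is valid at all (only 0 and 128 are) is a separate check.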

It's therefore not surprising to find the top flags encountered to be:

flag   # of records   comment

0      20,064,100     valid, critical flag unset
1      219,928        invalid, critical flag unset
128    49,775         valid, critical flag set
10     735            invalid, critical flag unset
250    18             invalid, critical flag set

(There are an additional 50 other values found, ranging from 2 to 250, with no clear indication what people thought those values might mean.)

CAA properties

RFC8659 defines three different properties: issue, issuewild, and iodef. That's it.[5] But of course you won't be surprised to find that across all the domains analyzed, we find over 100 additional words, including different misspellings of those three properties (e.g., issiue, issuewld, iodev) and what seems like guesswork based on expected functionality (e.g., enable). The overwhelming majority of records are, however, correct, and break down across the three valid properties like so:

pie chart showing % CAA records by tag type / property
pie chart showing % CAA records by tag type / property
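Sorting observed tags into "defined" and "everything else" can be sketched like this. The function name caa_tag_kind is hypothetical; the tag lists come from RFC8659 and from the CA/B Forum ballots mentioned in footnote [5]. Tags are lowercased first, on the assumption that they are matched case-insensitively.

```shell
# caa_tag_kind (hypothetical helper): classify a CAA property tag.
# Assumption: tags are matched case-insensitively, so lowercase first.
caa_tag_kind() {
    tag=$(printf '%s' "$1" | tr 'A-Z' 'a-z')
    case $tag in
        issue|issuewild|iodef)     echo "RFC8659" ;;
        contactemail|contactphone) echo "CA/B Forum" ;;
        *)                         echo "unknown" ;;
    esac
}

for t in issue issuewild iodef issiue issuewld iodev enable contactemail; do
    echo "$t: $(caa_tag_kind "$t")"
done
```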

Not surprisingly, the majority of organizations implementing CAA records want to restrict issuance, with most also utilizing wildcard issuance restrictions. What is a bit surprising, perhaps, is that only a very small number of organizations appear interested in receiving reports of attempted unauthorized issue requests. (But that is likely explained by the fact that RFC6844 makes honoring iodef optional ("...MAY report..."), and at least Let's Encrypt has publicly stated that they do not send mails on failed issuance due to CAA.)

The number of domains using any combination of these three properties is shown in more detail in the table below.

# of domains with...                       all TLDs    Top 1M domains

issue                                      2,851,746   151,046
issuewild                                  2,173,641   110,532
iodef                                      182,461     8,622
issue and issuewild                        2,139,747   28,910
issue, issuewild and iodef                 86,098      4,041
iodef and issue                            178,556     8,265
iodef and issuewild                        87,840      4,195
either issue or issuewild and not iodef    2,705,342   24,869
only issue                                 711,999     17,555
only issuewild                             33,894      1,856
only iodef                                 2,163       99

iodef

RFC8659 defines three valid methods for CAs to report requests for issuance that violate the policy: mailto, http, and https. For the most part, domains get this right, and not surprisingly prefer the simpler mailto reporting mechanism:

iodef method          all TLDs   Top 1M domains

mailto                174,230    8,248
raw email (invalid)   7,294      248
https                 166        24
http                  18         0
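The categories in this table can be reproduced with a simple pattern match on the iodef value. This is a rough sketch, not the exact classification used for the numbers above; iodef_method is a made-up name and the sample addresses are hypothetical.

```shell
# iodef_method (hypothetical helper): classify the value of an iodef
# property by its URL scheme. Only mailto:, https:// and http:// are
# valid; a bare email address is a common mistake.
iodef_method() {
    case $1 in
        mailto:*)  echo "mailto" ;;
        https://*) echo "https" ;;
        http://*)  echo "http" ;;
        *@*)       echo "raw email (invalid)" ;;
        *)         echo "invalid" ;;
    esac
}

iodef_method "mailto:security@example.com"
iodef_method "security@example.com"
iodef_method "https://example.com/caa-report"
```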

Most domains have a single iodef record, although some have multiple, while others clearly misunderstood the proper syntax of the RR, and at least one is using the record as a Log4Shell canary:

$ host -t caa elevate.services | grep iodef
elevate.services has CAA record 0 iodef "mailto:[email protected]"
elevate.services has CAA record 0 iodef "mailto:[email protected]"
elevate.services has CAA record 0 iodef "mailto:[email protected]"
$ host -t caa smartroom.com | grep iodef
smartroom.com has CAA record 0 iodef "comodoca.com"
smartroom.com has CAA record 0 iodef "usertrust.com"
smartroom.com has CAA record 0 iodef "trust-provider.com"
smartroom.com has CAA record 0 iodef "mailto:[email protected]"
smartroom.com has CAA record 0 iodef "sectigo.com"
$ host -t caa kyhwana.org | grep iodef
kyhwana.org has CAA record 0 iodef "mailto:[email protected]"
kyhwana.org has CAA record 0 iodef "${jndi:ldap://baylwjkcgkp30xx2ut082owpu.canarytokens.com/a}"
$ 

The most frequently used iodef records are shown below:

pie chart showing % CAA iodef records
pie chart showing % CAA iodef records

Note the dominance of [email protected] for the iodef records. I'm pleased to see this, since setting the right CAA policy and adding default CAA records for all of Yahoo's (many) parked domains was something I pushed for during my time there. Yay! \o/

issue and issuewild

Ok, so now let's see what CAs the different domains authorize. In total, I found almost 2,200 distinct issue records (for domains in all TLDs, 456 distinct for the Top 1M domains) and 878 issuewild records (all TLDs, 227 Top 1M).

The various misspellings and otherwise invalid records aside, the top 20 CAs in these records are:

issue records (in all TLDs)   count       issue records (in Top 1M Domains)   count

1. letsencrypt.org            2,769,264   1. letsencrypt.org                  38,218
2. digicert.com               2,059,878   2. digicert.com                     29,777
3. comodoca.com               2,010,652   3. comodoca.com                     24,098
4. globalsign.com             1,901,486   4. pki.goog                         19,078
5. sectigo.com                1,300,807   5. globalsign.com                   9,522
6. pki.goog                   384,434     6. sectigo.com                      8,632
7. trust-provider.com         157,727     7. amazon.com                       5,382
8. ;                          79,788      8. amazonaws.com                    2,545
9. amazon.com                 70,065      9. amazontrust.com                  2,139
10. certum.pl                 32,870      10. godaddy.com                     2,020
11. entrust.net               23,103      11. awstrust.com                    1,998
12. godaddy.com               22,537      12. entrust.net                     949
13. geotrust.com              14,587      13. certum.pl                       620
14. starfieldtech.com         13,776      14. ;                               417
15. ssl.com                   13,484      15. quovadisglobal.com              407
16. amazonaws.com             13,051      16. geotrust.com                    395
17. amazontrust.com           10,922      17. symantec.com                    354
18. awstrust.com              10,500      18. trust-provider.com              338
19. rapidssl.com              9,549       19. thawte.com                      318
20. comodo.com                7,968       20. comodo.com                      268

What you see here shows the overwhelming majority of CAA records using just a handful of CAs. (The use of ; signals that no CA is allowed to issue a certificate for the domain in question; this is used primarily for parked and otherwise unused domains.)

But recall what RFC8659 says about the meaning of these records:

If the issue Property Tag is present in the Relevant
RRset for an FQDN, it is a request that Issuers:

1. Perform CAA issue restriction processing for the FQDN, and

2. Grant authorization to issue certificates containing that FQDN
   to the holder of the issuer-domain-name or a party acting under
   the explicit authority of the holder of the issuer-domain-name.

(Emphasis mine.) Who is the "holder of the issuer-domain-name" for geotrust.com, rapidssl.com, or thawte.com? That's right: DigiCert. That is, by specifying, say, geotrust.com in your CAA record, you are implicitly also granting the various DigiCert subsidiaries authorization. So we can collate many of the above records, which then gives us a breakdown of the most popular CAs used in CAA issue and issuewild records:

pie chart showing % CAA issue records
pie chart showing % CAA issue records
pie chart showing % CAA issuewild records
pie chart showing % CAA issuewild records

Or, if you prefer Pareto charts:

pie chart showing % CAA issue records
pie chart showing % CAA issue records
pie chart showing % CAA issuewild records
pie chart showing % CAA issuewild records
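The collation step can be sketched as a lookup from issuer domain name to the organization holding it. This is a minimal illustration, not the mapping used for the charts: ca_owner is a made-up name, and only the DigiCert names mentioned in this post are mapped; everything else is passed through unchanged.

```shell
# ca_owner (hypothetical helper): map an issuer-domain-name to the
# organization that holds it. Only the DigiCert names named in the
# text are included; a real mapping would need many more entries.
ca_owner() {
    case $1 in
        digicert.com|geotrust.com|rapidssl.com|thawte.com) echo "DigiCert" ;;
        *) echo "$1" ;;
    esac
}

# Tally a few issuer domains by owner.
for domain in geotrust.com rapidssl.com thawte.com letsencrypt.org; do
    ca_owner "$domain"
done | sort | uniq -c | sort -rn
```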

Extensions

As noted above, even authorizing a given CA can still end up being rather broad, and you may well want to have much tighter restrictions, such as specifying which specific account under a given CA may request certificates for a domain, or how the CA should validate the request. For this, RFC8657 specifies two extensions: the accounturi parameter and the validationmethods parameter.
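In the record itself, these parameters follow the issuer domain name, separated by semicolons. The sketch below splits such a value into its parts; caa_params is a made-up name, and the CA domain and account URI in the example are hypothetical placeholders.

```shell
# caa_params (hypothetical helper): split the value of an issue or
# issuewild property into the issuer domain name and its parameters.
caa_params() {
    printf '%s\n' "$1" | tr ';' '\n' |
        sed -e 's/^[[:space:]]*//' -e 's/[[:space:]]*$//' |
        awk 'NR == 1 { print "issuer: " $0; next }
             $0 != "" { print "parameter: " $0 }'
}

# Hypothetical CA and account URI, for illustration only.
caa_params "ca.example.net; accounturi=https://ca.example.net/acct/1; validationmethods=dns-01"
```

A value consisting of just ";" yields an empty issuer, which matches its meaning of "no CA is authorized".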

There is also a draft on Signed HTTP Exchanges within the Web Packages group that adds another parameter: cansignhttpexchanges. As of May 2023, it looks like the only CAs supporting this parameter are digicert.com and pki.goog (see e.g., DigiCert's documentation as well as a discussion on the Let's Encrypt forum), although I also saw a very small number of domains setting this parameter on records authorizing letsencrypt.org, sectigo.com, amazon.com, and globalsign.com. (I'm guessing those were set, but not honored.)

In addition, I encountered three more extension parameters that appear to not be well documented: policy=ev (found only in combination with comodo.com), root=g1-class3 (found only in combination with cacert.org), and account= (found only in combination with letsencrypt.org, digicert.com, cacert.org, and Amazon's CAs). It is not clear to me whether these are actually supported by the different CAs, or if they are opportunistically or mistakenly set by the domain owner.

The use of these extension parameters broken down by number of domains using them looks like this:

extension parameter    count (all TLDs)   count (Top 1M Domains)

cansignhttpexchanges   259,245            17,108
validationmethods      559                43
accounturi             243                29
account                163                29
root                   11                 0
policy                 9                  4

The validationmethods values encountered were dns-01 (dominant), http-01 and tls-alpn-01; the accounturi values were primarily under https://acme-v02.api.letsencrypt.org/, with just a handful under https://acme-v01.api.letsencrypt.org/ and https://acme-staging-v02.api.letsencrypt.org/.

Summary

Having analyzed around 214 million domain names, here are my main findings:

CAA records are still not widely used.
Across all TLDs, only 1.4% of domains use CAA records; out of the Top 1M Domains, only 4.8%. Considering that CAA records have been around since 2010 and honoring them has been mandatory for CAs since 2017, this seems like a poor adoption rate, likely because (a) the PKI threat model it addresses is poorly understood; and (b) the implementation can lead to difficulties if the use of domain names and third party services used is not clearly organized.

Most people don't set iodef.
Those domains that do use CAA records tend to use the issue (52% for all TLDs, 55.9% for the Top 1M Domains) and issuewild (46.9% and 40.9%) records, but only a minuscule fraction (0.9% and 3.2%) set iodef. This may be a sign that organizations generally are not well prepared to handle error reports, though even if honoring iodef is optional in the RFC, I am a bit surprised by these abysmal numbers.

Extensions are not widely used.
This is not surprising, since they require subject matter expertise that, frankly, is absent in most organizations. What is surprising, to me at least, is that the non-standard cansignhttpexchanges extension is so dominant here. I suspect this is something that is being pushed by Google -- hence the frequent use on pki.goog -- as part of the "Accelerated Mobile Pages" (AMP) framework, but no industry wide consensus seems to have built up.

A small number of CAs dominate.
This is not surprising, but the concentration is still stark: seven Certificate Authorities account for over 99% of all CAA issue and issuewild records (10 CAs for 99% of the Top 1M Domains); three alone for over 75%: Comodo, DigiCert, and Let's Encrypt.

Even though this only covers the small percentage of domains that do set CAA records, I would not be surprised if the overall use of CAs across all domains followed a similar distribution. (In some markets, regional players will play a bigger role; once again the inability to get access to all ccTLD zones makes this difficult to assess.)

If you're wondering whether you really need to have over 160 different CAs in your trust bundle, I suspect the answer is "no"; you could likely get away with fewer than 20 and wouldn't notice the difference. But whether that's a good thing, whether it's wise for the entire internet to place all -- well, >99% -- of its certificates/eggs into fewer than 10 CAs/baskets seems more than questionable.


Footnotes:

[1] I use the term "browsers" here as if all browsers implemented the same features. Of course there are differences, but since basically all browsers are Chrome now anyway, they are, sadly, becoming increasingly less relevant. Whatever Chrome does is what "the browsers" do now.

[2] NB: as of 2023-05-16, it looks like only Google, Facebook, the TorProject, and Yahoo have static pins in Chrome. Considering that changing or updating your static pins requires the release and propagation across all markets you care about -- of multiple browsers, no less -- it might be time to deprecate that, too.

[3] CT is nowadays enforced in the browsers[1], which is why the Expect-CT header, defined in RFC9163, was pretty short-lived.

[4] It's worth noting that compliance with CAA records, like Certificate Transparency and some other restrictions, is not required for root certs that you (or your organization's IT policy) installed in your trust bundle yourself.

[5] CA/B Forum Ballot SC13 and Ballot SC14 added contactemail and contactphone to allow domain owners to provide information that increasingly is hidden in WHOIS. But these are not defined in the RFC and very rarely used: only 741 out of all domains observed used contactemail (54 out of the Top 1M Domains), and 23 used contactphone (3 out of the Top 1M Domains).

