October 15, 2018

Thoughts on CDNs

Edit 2019-08-30: add misinformation paragraph.

CDNs are great tools. They make it possible to distribute the dependencies of web pages efficiently. However, they might be breaking the Internet. Here is why.

Why would you use a CDN?

DRY

The main idea behind CDNs is the good practice of avoiding repetition (DRY, “don’t repeat yourself”). Nowadays, the vast majority of web pages contain JavaScript libraries. With a CDN, the library is distributed through a third-party service: instead of loading it from the website you are visiting, you load it from somewhere else. Those services are optimized for speed and high capacity. Most of the time, they distribute popular libraries such as AngularJS, to the point that the CDN link is often the first installation option listed:

[Image: AngularJS download page with the link to the CDN listed before the npm/bower commands]

For developers, CDNs make installing dependencies easier: a simple copy and paste and that’s it. Otherwise, you have to download the library manually or with a package manager such as npm, and link the correct file from your code yourself.

What consumes time?

When a page loads, what usually takes the longest is the round trips between the client and the server. So, to load your pages faster, you need to:

  • reduce the number of requests
  • execute requests in parallel
  • rely on cache when it’s relevant

With a CDN, the goal is to rely on cache as much as possible to avoid reloading each dependency on every website. That way, if a dependency is used on multiple pages, even across multiple websites, you effectively download it only once.

Problems with CDNs

Dependencies’ versions

When you depend on, let’s say, jQuery, you obviously want the exact version you developed and tested your site with. That is the only way to avoid unexpected behavior.

Nowadays, the JavaScript ecosystem is extraordinarily dynamic. Many developers release new versions of their libraries on a monthly basis or even more often.

With so many versions in circulation, almost all the advantages of caching are lost: two websites only share a cached copy if they reference exactly the same URL, version included.
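To make the point concrete, here is a small sketch of a cache keyed by URL, which is roughly how an HTTP cache identifies resources. The CDN host and the list of versions are made up for the illustration, and the model ignores everything else a real browser does (expiry, eviction and so on).

```javascript
// A toy model of an HTTP cache: entries are keyed by the full URL.
// The CDN host and the version numbers below are made up for illustration.
function countDownloads(requestedUrls) {
  const cache = new Set();
  let downloads = 0;
  for (const url of requestedUrls) {
    if (!cache.has(url)) {
      cache.add(url); // cache miss: the file is fetched over the network
      downloads += 1;
    }
  }
  return downloads;
}

const cdn = 'https://cdn.example.com/jquery';

// Three sites that all pinned the same version: one download, two cache hits.
const sameVersion = [
  `${cdn}/3.3.1/jquery.min.js`,
  `${cdn}/3.3.1/jquery.min.js`,
  `${cdn}/3.3.1/jquery.min.js`,
];

// Three sites that each pinned a different version: three downloads.
const mixedVersions = [
  `${cdn}/3.3.1/jquery.min.js`,
  `${cdn}/3.2.1/jquery.min.js`,
  `${cdn}/1.12.4/jquery.min.js`,
];

console.log(countDownloads(sameVersion));   // 1
console.log(countDownloads(mixedVersions)); // 3
```

As soon as each site pins its own version, the visitor downloads the library once per site, exactly as if no CDN were involved.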

Single point of failure

Obviously, CDNs create a single point of failure (SPOF): numerous websites depend on the same global address.

Great Firewall of China

In China, the state’s firewall, commonly known as the ‘Great Firewall’, blocks a non-negligible percentage of CDN providers. Other countries where Internet access is restricted may also experience this kind of issue.

The Internet was made to be international. Even if your website is not translated, it might be a bad thing that it is available to fewer people. Furthermore, you will not even notice it.

Problems with technologies such as ReactJS

CDNs create a kind of style/JavaScript apocalypse when they break down.

On a ‘traditional’ page, the style might break and you may experience some issues with JavaScript, but you will at least have access to the page’s content.

However, if your website depends on client-side rendering libraries such as ReactJS, a CDN outage can make the content itself unavailable, since the library is used to fetch and render the text.

Tracking, malicious behavior and security

CDNs can also track people by analysing the number of requests processed by their servers and where those requests come from. They are even proudly publishing these statistics. Are developers aware of the possible side effects of such practices? What if your website is the only one to use a specific library? That would mean exposing your visitors’ statistics publicly without even realising it!

They are also a very interesting attack vector: they send arbitrary code! This can be mitigated by checking code integrity, but that is rarely implemented in practice.

<script
    src="https://stackpath.bootstrapcdn.com/bootstrap/4.1.3/js/bootstrap.min.js"
    integrity="sha384-ChfqqxuZUCnJSK3+MXmPNIyE6ZbWh2IMqE241rYiqJxyMiZ6OW/JmZQ5stwEULTy"
    crossorigin="anonymous">
</script>

Pay attention to the integrity attribute. It contains a hash of the resource located at src. That way, if the resource changes, the browser will notice and will not execute the suspicious code. However, according to MDN at the time of writing, this feature is still experimental.
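For illustration, here is one way such an integrity value could be computed with Node.js. This is only a sketch: the sriHash helper and the sample content are my own, and in practice you would hash the exact file you intend to reference.

```javascript
const crypto = require('crypto');

// Build an integrity value of the form "algorithm-base64digest"
// for the given file contents.
function sriHash(content, algorithm = 'sha384') {
  const digest = crypto.createHash(algorithm).update(content).digest('base64');
  return `${algorithm}-${digest}`;
}

// Hypothetical library source, used only for the example. In practice you
// would read the real file, for instance with fs.readFileSync.
const source = 'console.log("hello");';
const integrity = sriHash(source);

// Prints a string starting with "sha384-" followed by the base64 digest.
console.log(integrity);
```

If a single byte of the file changes, the digest changes too, and the browser refuses to execute the script.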

Let’s hope that these services will not start injecting advertisements into your favorite JS library. Have you ever thought about their business model?

Misinformation

Check https://jamstack.org/best-practices/: this website is full of good ideas, but pushes hard for developers to rely on CDNs. This website is a Netlify initiative. Now let’s check their main website at https://www.netlify.com/pricing/: it looks like they are actually selling CDN services. What a coincidence! Some companies that sell CDN services and are part of the web developer community are pushing for everyone to use their products. That is simply in their interest, but now I just see Jamstack as a huge ad for their services.

Conclusion

First of all, as a user, you should use Decentraleyes. This browser extension bundles the most popular libraries. Once it is installed, when a request is sent to a CDN, it is redirected to local data. It is an absolutely zero-cost solution, and you get better performance as a bonus, since you are loading libraries locally instead of over the network.

[Image: a diagram describing how Decentraleyes works]

As a developer, there seem to be two opposing good practices: DRY and avoiding a single point of failure. A good solution would be to set up a fallback system. However, there is also a matter of freedom, which is why I don’t use CDNs.

Tools such as Google Webfonts Helper can help you self-host resources (fonts, in this case) instead of loading them from a third party.

The main idea is that you should not rely on a third-party service to keep your website up.
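For those who still want a CDN, the fallback system mentioned above could look roughly like the sketch below. The function names and URLs are hypothetical. The idea is simply to try each source in order and stop at the first one that loads, with a self-hosted copy as the last resort.

```javascript
// Try each source in order and resolve with the first one that loads.
// `load` is whatever actually fetches a script and returns a promise.
async function loadWithFallback(sources, load) {
  let lastError;
  for (const url of sources) {
    try {
      await load(url);
      return url; // this source worked, stop here
    } catch (error) {
      lastError = error; // remember the failure and try the next source
    }
  }
  throw lastError || new Error('No sources given');
}

// Browser loader sketch: it is only defined here, and needs a real
// document object to run.
function loadScript(url) {
  return new Promise((resolve, reject) => {
    const script = document.createElement('script');
    script.src = url;
    script.onload = () => resolve(url);
    script.onerror = () => reject(new Error(`Failed to load ${url}`));
    document.head.appendChild(script);
  });
}

// Usage in a page: CDN first, self-hosted copy second (made-up URLs).
// loadWithFallback(
//   ['https://cdn.example.com/lib/1.0.0/lib.min.js', '/vendor/lib.min.js'],
//   loadScript
// );
```

Note that the fallback only helps if the self-hosted copy exists, so you end up managing the library yourself anyway.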

From my point of view, many (lazy) developers use CDNs to avoid manual library management. This task, however, is definitely required. If a developer is not able to do such a simple thing and to think about the whole impact of that kind of choice, you should definitely not trust their code.

Finally, caching and CDNs are two different things. CDN providers are deliberately blurring the lines for marketing purposes. While CDNs can help you improve your caching a bit, there are lots of much better solutions to implement first: manual cache control, minification, resource shrinking…
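As an example of manual cache control, here is a minimal sketch of how a server could choose a Cache-Control header per file. It assumes a naming convention that is mine, not something from the tools above: files with a content hash in their name (such as app.3f2a9c1b.js) never change, so they can be cached for a long time, while everything else is revalidated.

```javascript
// Pick a Cache-Control header for a static asset.
// Assumption: fingerprinted files (a content hash in the name) never change.
function cacheControlFor(path) {
  const fingerprinted = /\.[0-9a-f]{8,}\.(js|css|woff2?)$/.test(path);
  if (fingerprinted) {
    return 'public, max-age=31536000, immutable'; // cache for one year
  }
  return 'no-cache'; // revalidate with the server before reuse
}

console.log(cacheControlFor('/static/app.3f2a9c1b.js')); // public, max-age=31536000, immutable
console.log(cacheControlFor('/index.html'));             // no-cache
```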

Except where otherwise noted, content on this site is licensed under CC BY-SA 4.0.