Page speed optimization

Loading speed really matters. You can read this everywhere, from official Google sources to blog posts written by experts on SEO (search engine optimization) and UX (user experience).

In short, speed optimization is a must-have today. Theoretical advice on how to make a page load faster can be found on many websites. If you do not want to optimize your page on your own, you can hire GameArter for the job. GameArter's main focus is its own projects - providing complex services for game developers and running websites such as PacoGames.com or Games44.com. We deal with common web-based, game-based and data-related issues on a daily basis, which gives us experience that we are happy to apply to client orders. Here is an example of an optimization we carried out.

Practical example: Page speed optimization of the PacoGames mobile version.

Page performance is important everywhere, but it matters even more on mobile devices. This is also clear from Google's PageSpeed Insights: the tool reports mobile and desktop scores separately, and the desktop score is usually far better. PageSpeed Insights was also the reason why our colleagues from PacoGames asked us to help them become “more green” in terms of speed.

[Figure: PageSpeed Insights score before page speed optimization]

Optimization process

  1. Pre-optimisation tracking
  2. Page analysis
  3. Consultation with client
  4. Optimization strategy
  5. Optimization itself
  6. Result
  7. Order duration

1. Pre-optimisation tracking

A PageSpeed Insights report is a good start and a sufficient resource in most cases. However, to report the benefit of our optimization to our colleagues at PacoGames more precisely, we set up additional checks that let us track the effect of the page speed optimization in real time.

The first tool we used was New Relic. It allows us to track app performance on the backend as well as on the frontend, measured on real users visiting the app.

First, we were interested in the backend and infrastructure. Usually, all problems on this side should be solved before starting any speed optimization on the client side.

[Figure: New Relic - speed of the server application]

The PacoGames backend did not show any problem that could affect speed on the client side (in the browser), so we could continue with the next checks reasonably certain that the data would not be distorted.

[Figure: New Relic - client-side page load speed across all pages]

The complete page load time is the sum of the times required by the web application, the network, DOM processing and page rendering. This time differs between page types.
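To make the sum concrete, here is a minimal sketch that splits a page load into the same four parts using the browser's Navigation Timing entry. The mapping of timing fields to the four buckets is our own rough assumption for illustration; New Relic's exact definitions may differ.

```javascript
// Splits a PerformanceNavigationTiming-like object into four parts.
// The bucket boundaries below are an assumed, approximate mapping.
function loadTimeBreakdown(t) {
  // Everything before the request is sent, plus the response download.
  const network = (t.requestStart - t.startTime) + (t.responseEnd - t.responseStart);
  // Waiting for the first byte after the request was sent.
  const webApplication = t.responseStart - t.requestStart;
  // Parsing the document until DOMContentLoaded has finished.
  const domProcessing = t.domContentLoadedEventEnd - t.responseEnd;
  // Remaining work until the load event has finished.
  const pageRendering = t.loadEventEnd - t.domContentLoadedEventEnd;
  return {
    network,
    webApplication,
    domProcessing,
    pageRendering,
    total: network + webApplication + domProcessing + pageRendering,
  };
}

// Browser usage (run after the load event has finished):
// const [nav] = performance.getEntriesByType("navigation");
// console.log(loadTimeBreakdown(nav));
```

With this mapping the four parts always add up to `loadEventEnd - startTime`, which is the complete load time discussed above.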

[Figure: New Relic - page load speed histogram across all pages]

For a more detailed view than the average, the histogram shows the number of page loads in each loading-time range.

As a last step, we checked loading times in Google Analytics (Behavior / Site Speed / Page Timings) and set up regular tests from various parts of the world in StatusCake.

2. Page analysis

A quick check of the page revealed the following:

  • The page runs on an up-to-date PHP version and is served over HTTP/2.
  • Static resources (images, styles, JavaScript…) are served from a CDN with correct cache settings, minified and compressed.
  • The page renders a simple DOM.
  • The page uses many images. They are in the modern WebP format (where possible), compressed, and optimized in dimensions and file size. However, they are not lazy loaded.
  • TTFB (time to first byte) is not ideal in most geographic locations.
  • Although there is a visible effort to prioritize resource loading, it is not optimal.
  • All ads on the page are loaded immediately rather than when a user can see them (no lazy loading for ads).
  • Some JavaScript-powered features can be replaced with more efficient CSS-based features (e.g. smooth scroll).
  • The most used elements on the site - images and rating counters - use inefficient styling that requires a higher number of DOM elements.
  • The website uses its own template, including styles and JavaScript, but much of that code goes unused on a given page, and pages usually load some libraries that may not be necessary.
[Figure: Page resources coverage usage]

3. Consultation with client

Based on the opportunities we found, we could start discussing the scope of the contract. As usual, we recommended the Pareto principle - get 80% of the result in 20% of the time. In the following sections of this post, we always mark which work was done within the Pareto principle and which work we would do if we optimized the site for 100% performance.

4. Optimization strategy

  • Optimization will improve page speed and reduce operating costs.
  • Optimization will take into account the technologies used on the website.
  • Optimization will be done by adjusting when resources are loaded and by optimizing the core JavaScript and style files.

5. Optimization itself

5.1 Resources optimization

The website was using 1 global stylesheet and a few global JavaScript libraries. As a result, the browser had to load, parse and execute big files with mostly unused code on every page. Usage statistics for every loaded file are available via the Coverage tool in Google Chrome's developer tools.
PacoGames was coded with respect for modern principles. Individual template parts are separated and built into the final format via the streaming build system Gulp.

Styles, written with the Stylus preprocessor, have a separate file for every class. This makes it possible to segment classes, manually or automatically, by the pages they are used on and, based on the results, to choose a new page-based layout of style files.

JavaScript, written in separate modules based on purpose, is bundled into one file via webpack. Again, based on real usage per page, this one JavaScript file can be split into several smaller, more memory-efficient files.

Option 1 - Cost (time) effective optimization strategy (Pareto principle)

Styles and JavaScript would be split into 4 separate files.

  • 1 base style file and 1 base JavaScript file contain the functionality required on all pages of the website. The style file is 10.9 KB and the JavaScript file is 1.9 KB, loaded asynchronously. These 2 files are loaded over the network on the first page only; all other pages load them from cache.
  • 3 individual style and JavaScript files extend the base files with the components required on individual pages:
    • Shared CSS and JS for the main page and category pages
    • CSS and JS file for game pages
    • CSS and JS file for all remaining pages
  • The website further contains separate scripts for other required features such as lazy loading, GDPR bars, notifications or service workers.
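For the JavaScript side, a split like this can be expressed as multiple webpack entry points. The sketch below is only an illustration: the entry names and source paths are hypothetical and are not taken from the PacoGames build.

```javascript
// webpack.config.js - a sketch with one base bundle and three page-type bundles.
// All entry names and file paths are hypothetical placeholders.
const config = {
  mode: "production",
  entry: {
    base: "./src/js/base.js",       // functionality required on every page
    listing: "./src/js/listing.js", // main page and category pages
    game: "./src/js/game.js",       // game pages
    other: "./src/js/other.js",     // all remaining pages
  },
  output: {
    // Each entry is emitted as its own file, e.g. base.js or game.js.
    filename: "[name].js",
  },
};

module.exports = config;
```

Each page then includes the base bundle plus the one bundle for its page type, so the base file can be served from cache after the first pageview.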
Option 2 - Complex optimization strategy

The website still uses 1 core CSS file and 1 core JavaScript file plus additional page-related style and JavaScript files. However, all of these files are lighter, and they are extended by multiple other JavaScript and style files that are placed or loaded into the page only when needed. This reduces the weight of the resources that must be loaded and processed during a page load.

5.2 Elimination of resources

Every website carries resources for backward compatibility, minor functions, and many monetization and tracking scripts. You always need to ask what is really needed and what is worth slowing the page down for. PacoGames required us to preserve all functionality and resources (except the synchronously loaded Modernizr library in the header, which was useless on this website), so we moved on to customizing loading priorities.

5.3 Customization of loading priorities

Option 1 - Cost (time) effective optimization strategy (Pareto principle)

Based on pageview data from Google Analytics, we focused mainly on the main page. By default, it loaded a group of libraries for searching. It is easy to tell when such libraries are needed - when the search icon is pressed. Thus, we removed all of these libraries from the initial page load and load them dynamically only when they are really necessary. For simple management, we load a shared JavaScript file for this purpose, which is then cached for all other pages.
With dynamic loading in place for the search libraries and all their features (3 libraries + jQuery in total), it was also time-effective to remove jQuery by rewriting the small part of JavaScript that used it. After this work, the homepage does not require any library by default and all scripts are loaded asynchronously.

Other pages use the shared JavaScript file by default (the same file that is loaded dynamically for searching on the main page), plus individual JavaScript files for page-based functionality where needed.
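Loading a script only on the first press of the search icon can be sketched as follows. This is a minimal illustration, not the code running on PacoGames; the function names and the file URL in the usage comment are hypothetical.

```javascript
// Cache of started loads, keyed by URL, so each script is requested only once.
const loadedScripts = new Map();

// Injects a <script> tag and returns a promise that settles when it has loaded.
function loadScript(url, doc = document) {
  if (!loadedScripts.has(url)) {
    loadedScripts.set(url, new Promise((resolve, reject) => {
      const script = doc.createElement("script");
      script.src = url;
      script.async = true;
      script.onload = () => resolve(url);
      script.onerror = () => reject(new Error("Failed to load " + url));
      doc.head.appendChild(script);
    }));
  }
  return loadedScripts.get(url);
}

// Starts loading the given scripts the first time the element is clicked.
function loadOnFirstClick(element, urls, doc = document) {
  element.addEventListener(
    "click",
    () => Promise.all(urls.map((url) => loadScript(url, doc))).catch(console.error),
    { once: true }
  );
}

// Browser usage (hypothetical selector and URL):
// loadOnFirstClick(document.querySelector(".search-icon"), ["/js/shared.js"]);
```

Because the promise is cached per URL, a second component asking for the same file reuses the first request instead of injecting another script tag.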

Option 2 - Complex page speed optimization

Complex page speed optimization covers all pages of the website. In this case, there is no hybrid scheme of 1 shared library used for multiple pages and use cases. Instead, each external component (one needed only after a certain user action) has its own resources (style, JavaScript), which are loaded and activated based on the user's interaction. Simple examples:

  • You click the search icon → the search style and required JS libraries are loaded
  • You click report → the style and JS libraries for reports are loaded
  • You scroll to the comments → the style and JS libraries for comments are loaded
  • You scroll to a component using a carousel → the libraries for the carousel and touch events are loaded
  • Similarly for all other resource-heavy elements that are not necessary for the majority of pageviews

All the libraries are independent files that know their own dependencies and are able to load other external resources (e.g. jQuery) if they need them. jQuery or any other external library is not necessary for the basic page load.

The website does not contain any separate scripts for features such as lazy loading, GDPR bars, notifications or service workers; everything is implemented within either the basic or the dynamically loaded resources.
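One way to let each component know its own dependencies is a small manifest that is resolved into a load order before anything is fetched. The sketch below assumes such a manifest; the component names, URLs and dependency links are hypothetical examples.

```javascript
// Hypothetical manifest: every component lists the components it depends on.
const manifest = {
  jquery: { url: "/js/vendor/jquery.js", deps: [] },
  autocomplete: { url: "/js/vendor/autocomplete.js", deps: ["jquery"] },
  search: { url: "/js/components/search.js", deps: ["autocomplete"] },
  carousel: { url: "/js/components/carousel.js", deps: [] },
};

// Returns component names in the order they must be loaded (dependencies first).
function resolveLoadOrder(name, components, order = [], visiting = new Set()) {
  if (order.includes(name)) return order; // already scheduled
  if (visiting.has(name)) throw new Error("Circular dependency at " + name);
  const component = components[name];
  if (!component) throw new Error("Unknown component " + name);
  visiting.add(name);
  for (const dep of component.deps) {
    resolveLoadOrder(dep, components, order, visiting);
  }
  visiting.delete(name);
  order.push(name);
  return order;
}

console.log(resolveLoadOrder("search", manifest));   // [ 'jquery', 'autocomplete', 'search' ]
console.log(resolveLoadOrder("carousel", manifest)); // [ 'carousel' ]
```

In this example jQuery is fetched only when the search component is requested, while the carousel loads without it.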

5.4 Asynchronous load of non-critical CSS

After splitting the 1 global CSS file into 1 shared CSS file and several page-type-based files, we can load the non-critical CSS files asynchronously so that they do not block rendering (usually about 180 ms per style file).

There are several ways to load styles asynchronously, differing in browser support, form of implementation and required libraries. The topic is well explained in the article Modern Asynchronous CSS Loading and on the Google PageSpeed Insights help page.

All variants of asynchronous CSS loading require JavaScript. Fortunately, browsers with JavaScript disabled can still be supported by repeating the same stylesheet link inside a <noscript> element.

After some tests in various browsers, with the goal of keeping CSS loading as effective as possible, we used this way of asynchronous CSS loading:

<!-- Critical css loaded synchronously -->
<link rel="stylesheet" href="/css/general.css" media="screen">

<!-- Non-critical css loaded asynchronously -->
<link rel="stylesheet" href="/css/... page related css ..." media="none" onload="this.media='all'">
<!-- support for browsers with blocked js -->
<noscript>
    <link rel="stylesheet" href="/css/... page related css ..." media="screen">
</noscript>

5.5 LazyLoad

For loading images and ads, we implemented our own lazy load mechanism. Lazy loading reduces the number of requests to the server, so fewer files are loaded and fewer bytes are transferred. Data is now downloaded only when a user needs it. Because Cloudflare chose between the WebP and JPEG image formats for PacoGames according to the supported formats the browser sends in the request header, we also had to make a light adjustment on the app side to keep WebP support working.

If you are thinking about using lazy loading in your application, there are many libraries to choose from, but only a few of them are really light and effective. The easiest way is to use the native loading="lazy" attribute, available in Chrome.

If you need cross-browser support for lazy loading, make sure your solution uses the following fallbacks:

// Pick a lazy-loading strategy based on what the browser supports.
if ('loading' in HTMLImageElement.prototype) {
	// Use native lazy loading via loading="lazy"
	// (cannot be used for ads)
} else if ('IntersectionObserver' in window) {
	// Use lazy load via IntersectionObserver.
	// This is more performance efficient than polling getBoundingClientRect()
	// and it also works well on iPhone.
} else {
	// Classic, most used way: check getBoundingClientRect() on scroll
}

At GameArter, we still prefer IntersectionObserver over native lazy loading, even in Chrome, because IntersectionObserver allows us to apply our own lazy load configuration and thus achieve better results.
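The observer-based branch of the fallback chain could look roughly like this. It is a minimal sketch and not GameArter's implementation; the `whenVisible` helper, the `data-src` attribute and the 200px margin are assumptions chosen for the example.

```javascript
// Calls onVisible(element) once, shortly before the element scrolls into view.
// The rootMargin default is an assumed value; tuning it is the "own configuration".
function whenVisible(
  elements,
  onVisible,
  options = { rootMargin: "200px 0px" },
  Observer = IntersectionObserver
) {
  const observer = new Observer((entries, obs) => {
    for (const entry of entries) {
      if (!entry.isIntersecting) continue;
      obs.unobserve(entry.target); // each element is handled only once
      onVisible(entry.target);
    }
  }, options);
  for (const element of elements) observer.observe(element);
  return observer;
}

// Browser usage for images that keep their real URL in data-src:
// whenVisible(document.querySelectorAll("img[data-src]"), (img) => {
//   img.src = img.dataset.src;
// });
```

The same helper can trigger an ad slot instead of an image by passing a different callback, which is something the native loading attribute cannot do.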

5.6 DOM size optimization

If you hire a coder for your website and the delivered work looks exactly like the graphic design in your assignment, you are usually happy. However, every slightly more complex design element on a website can be built in multiple ways, and although the result always looks the same, performance can differ significantly. A general performance rule is to use as few DOM elements as possible. Here is the example of the PacoGames rating element:

PacoGames rating counter
  • 5 DOM elements per rating counter, made via transform: rotate (used by default on PacoGames desktop before our optimization). Performance on 1000 elements: 376 ms
  • 7 DOM elements per rating counter, made via an svg element (used by default on PacoGames mobile before our optimization). Performance on 1000 elements: 412 ms
  • 2 DOM elements per rating counter, made via a CSS gradient (our solution, currently on desktop and mobile). Performance on 1000 elements: 191 ms
  • Another option is the canvas element (1 DOM element per counter), but its usage is limited, so we did not test this solution.
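The markup of the gradient-based counter is not shown here, so the following is only a sketch of the general idea under our own assumptions: the rating is drawn as a hard-edged gradient on a single element, so no extra child elements are needed. The function name and the colours are hypothetical.

```javascript
// Builds a CSS background that fills `percent` of the element with one colour
// and the rest with another. Both colour stops share a position, giving a sharp edge.
function ratingGradient(percent, filled = "#f5a623", empty = "#d8d8d8") {
  const p = Math.min(100, Math.max(0, percent)); // clamp to 0-100
  return `linear-gradient(90deg, ${filled} ${p}%, ${empty} ${p}%)`;
}

// Browser usage (hypothetical class name):
// document.querySelector(".rating").style.background = ratingGradient(82);
```

A circular counter could use the same approach with a different gradient type; the point is that the shape comes from the background rather than from additional DOM nodes.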

Read about other mistakes PacoGames made when hiring a coder.

5.7 TTFB optimization and operation costs reduction

Right from the first PageSpeed Insights result, it was clear that PacoGames had serious issues with First Contentful Paint time. The next test revealed the problem - TTFB in most locations. PacoGames was using a CDN only for resources (JS, CSS, images ...), while the HTML document itself was requested every time from a server in Central Europe. Although the server itself was well connected, the time needed to deliver data over thousands of miles was significant.

TTFB reduction usually requires a higher investment. The ideal way is to use a cloud-based service for running worldwide applications; the most popular is AWS (Amazon Web Services). AWS provides good TTFB to all destinations (if there are no speed-related issues in the server application). However, switching to AWS requires a larger time investment and good knowledge of its setup, otherwise AWS may be very expensive.

Besides AWS, there are other ways to reduce TTFB. They usually work on the principle that the complete page, including its HTML, is cached in a CDN near the end user. Google's AMP, which has received a lot of attention recently, is one example.

For PacoGames, we used a similar solution. PacoGames is now composed of static HTML pages (containing all static page sections) that are cached on multiple levels (the user's browser, the CDN node nearest to the end user, the PacoGames server), while all dynamic and personalised content (user account, ads configuration, last played / favourite games and others) is loaded dynamically into the static wrappers. All static wrappers are regularly rebuilt to contain up-to-date information. This solution helped us achieve better TTFB than we would achieve with AWS. Moreover, thanks to the lower number of requests reaching the servers (only requests for static wrappers), server performance requirements are lower, which gives the option to downgrade and save on monthly operating costs.
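The client side of such a static wrapper can be sketched as follows. This is an illustration under our own assumptions, not the PacoGames code: the `data-slot` attribute, the function names and the `/api/personal.json` endpoint are all hypothetical.

```javascript
// Fills every placeholder marked with data-slot="<key>" with the matching value.
// Returns the number of placeholders that were filled.
function applyDynamicContent(slots, data) {
  let filled = 0;
  for (const slot of slots) {
    const key = slot.dataset.slot;
    if (Object.prototype.hasOwnProperty.call(data, key)) {
      slot.textContent = String(data[key]);
      filled += 1;
    }
  }
  return filled;
}

// Fetches the personalised data (hypothetical endpoint) and applies it to the
// cached static page. Only this small request has to reach the origin server.
async function loadDynamicContent(doc = document, url = "/api/personal.json") {
  const response = await fetch(url, { credentials: "same-origin" });
  if (!response.ok) {
    throw new Error("Dynamic content request failed: " + response.status);
  }
  return applyDynamicContent(doc.querySelectorAll("[data-slot]"), await response.json());
}

// Browser usage: loadDynamicContent().catch(console.error);
```

The HTML around the placeholders is identical for every visitor, which is what allows it to be cached in the browser and on the CDN.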

5.8 Replacing standard ads with AMPHTML ads

AMP ads can be placed on any kind of website in a similar way to standard ads - either directly in a page or via Google Ad Manager.

AMPHTML ads are a faster, lighter and more secure way to advertise on the web. Although AMP pages support traditional HTML ads, these ads can be slow to load. To make ads themselves as fast as the rest of the AMP page, you can build ads in AMPHTML. AMPHTML ads are only delivered after being validated, ensuring that the ads are secure and performant. Most of all, these ads can be delivered anywhere on the web, not just on AMP pages.

https://amp.dev/documentation - intro to AMP ads.

5.9 Grouping of in-page scripts

Looking at the most used browser engine - Blink, the engine behind Chromium - we can find many best practices for page-side optimization. One example is grouping in-page scripts in the same place within a page.

According to this Chromium source file, every <script> element located more than 50 elements after the previous <script> element stops parsing and starts rendering. In terms of speed, every such individually placed script can cost an extra 20 ms of loading time.

6. Result

With all these changes, we achieved the following improvements (tracked for the homepage). The optimization was done in the cost-effective way (Pareto principle). Later, after these results, we made further optimizations, including points 5.6 and 5.7.

  • 47% fewer requests to the server
  • Transferred (compressed) file size reduced to under 1 MB
  • DOM content loaded within 1 second
[Figure: StatusCake loading speed tracking]
[Figure: Page speed before optimization / Page speed after optimization]
New Relic data - page speed comparison. The graphs show a load time reduction of about 1.5 seconds, which is 20%.
New Relic data - page speed histogram comparison. As you can see, loading times are now concentrated in the 2-4 second range.

Finally, PageSpeed Insights. Page speed, and thus also the PageSpeed Insights score, depends mostly on the presence of ads and how their loading is configured. When ads are placed in a page the classic way, they consume substantial resources, so loading time grows dramatically while the PageSpeed Insights score falls.

As mentioned, we also implemented lazy loading for ads, which let us optimize when ads are loaded. The most important setting concerns the first, most visible ad.

PageSpeed Insights score with the first ad loaded at the start. The final score after optimization is 75, an increase of 41% (from 53 → 75).
PageSpeed Insights score with modified loading timing of the first ad. The ad is loaded later, shortly before the user sees it. The final score after the optimization is 99, an increase of 87% (from 53 → 99). Looking at the impact of this 1 ad alone, the score rises from 75 to 99 → +32%. This configuration is currently active on PacoGames.

7. Order duration

This optimization took us 10 hours:

  • Setting up tracking: 0.5 hour
  • Analysis: 1 hour
  • Consultation of options: 0.5 hour
  • Splitting styles: 1.5 hours
  • Splitting JavaScript: 1 hour
  • Adjustment of loading priorities: 1.5 hours
  • Rewriting JavaScript: 2 hours
  • Final testing (optimization + functionality): 1.5 hours
  • Report: 0.5 hour