UTM URL codes

From LinuxReviews

The UTM parts of web addresses are added purely for tracking purposes and are not required to visit any web page. UTM stands for "Urchin Tracking Module", but today the parameters are used for general-purpose tracking. UTM URL junk is especially popular among marketers who rely on Google's analytics products.

Identification

Consider this web link:

https://www.mozilla.org/en-US/firefox/69.0/releasenotes/?utm_source=firefox-browser&utm_medium=firefox-browser&utm_campaign=whatsnew

The actually useful part of that very long link is https://www.mozilla.org/en-US/firefox/69.0/releasenotes/

The rest of it - ?utm_source=firefox-browser&utm_medium=firefox-browser&utm_campaign=whatsnew - is junk:

  • utm_source= tracks where the link came from. In this case the value firefox-browser names the supposedly "free" spyware browser Mozilla Firefox as the source.
  • utm_medium= tracks the kind of channel the link was delivered through, such as email, social media or a paid ad.
  • utm_campaign= names the marketing campaign. Marketers tend to put some identification here which helps track engagement, sometimes down to the individual customer.
  • utm_term= is also commonly used. If a search was done at the source, the term field typically shares the search term with the world. If you searched for something embarrassing and you share the resulting link, the recipient can see what you searched for.

Links with UTM tracking codes are inherently evil and a security risk. If you want to share a link like the one in the example above, you absolutely should remove the question mark and everything following it. Some websites do claim that "The danger level here is extremely low"[1]. However, it is worth being aware that UTM codes, and especially the utm_term field, can and frequently do contain somewhat personal information.
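Cutting at the question mark works for the example above, but some links carry query parameters that the page really needs. The following is a minimal Python sketch, not a definitive tool, that drops only parameters whose names start with utm_ and keeps the rest. The function name strip_utm and the example.org address are made up for illustration.

```python
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode


def strip_utm(url: str) -> str:
    """Return url without any query parameter whose name starts with 'utm_'."""
    parts = urlsplit(url)
    # Keep every parameter except the utm_* ones, preserving their order.
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if not key.lower().startswith("utm_")
    ]
    # An empty query makes urlunsplit leave out the question mark entirely.
    return urlunsplit(parts._replace(query=urlencode(kept)))


link = (
    "https://www.mozilla.org/en-US/firefox/69.0/releasenotes/"
    "?utm_source=firefox-browser&utm_medium=firefox-browser&utm_campaign=whatsnew"
)
print(strip_utm(link))
print(strip_utm("https://example.org/search?q=linux&utm_term=linux&page=2"))
```

Note that urlencode re-encodes the parameters it keeps, so unusual escaping in the original link may come out in a slightly different but equivalent form.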

Re-Writing UTM Codes In Incoming Links

Some sites will send you traffic with tracking links in the URLs. A user who then copy-pastes your URL will include those UTM codes even if you did not intend for there to be any tracking codes in your links.

You can catch links that carry utm_source= (or Facebook's fbclid=) and redirect to the same address without the query string using this Apache .htaccess code:

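RewriteEngine on
# Match request lines whose query string starts with fbclid= or utm_source=
RewriteCond %{THE_REQUEST} \?fbclid=(.*) [NC,OR]
RewriteCond %{THE_REQUEST} \?utm_source=(.*) [NC]
# Redirect to the same path; QSD discards the whole query string
RewriteRule ^ %{REQUEST_URI} [L,R,QSD]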
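To get a feel for which requests those two conditions catch, here is a rough Python sketch that applies the same regular expressions to a few sample request lines. It only mimics the pattern matching and is not how Apache evaluates the rules internally. The function name would_redirect and the sample paths are invented for this example.

```python
import re

# The same patterns as the two RewriteCond lines; [NC] maps to re.IGNORECASE.
PATTERNS = [
    re.compile(r"\?fbclid=(.*)", re.IGNORECASE),
    re.compile(r"\?utm_source=(.*)", re.IGNORECASE),
]


def would_redirect(request_line: str) -> bool:
    """True if either pattern matches; the conditions are joined with [OR]."""
    return any(pattern.search(request_line) for pattern in PATTERNS)


# THE_REQUEST holds the full request line as sent by the browser.
print(would_redirect("GET /page?utm_source=news HTTP/1.1"))       # True
print(would_redirect("GET /page?FBCLID=abc123 HTTP/1.1"))         # True
print(would_redirect("GET /page?id=7&utm_source=news HTTP/1.1"))  # False
```

The last line shows a limitation worth knowing about: the patterns require the tracking parameter to come directly after the question mark, so a link where it follows another parameter passes through untouched. When the rule does fire, QSD removes every parameter, not just the tracking ones.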

Blacklisting sites with UTM codes

If you are using YaCy or similar software to crawl the web then you will likely want to blacklist URLs containing these codes:

<item>.*.*.*/.*utm_term.*</item>
<item>.*.*.*/.*utm_source=.*</item>
<item>.*.*.*/.*utm_medium=.*</item>
<item>.*.*.*/.*utm_campaign=.*</item>

A single URL can have hundreds of variations when the UTM codes are included - URLs with them should not be part of any web index.
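If you run your own crawler instead of YaCy, the same idea can be sketched in a few lines of Python. This is only an illustration under assumptions: the function name has_utm_code and the example.org URLs are hypothetical, and unlike the patterns above it looks at query parameter names rather than matching a substring anywhere after the host name.

```python
from urllib.parse import urlsplit, parse_qsl

# The four codes covered by the blacklist entries above.
UTM_KEYS = {"utm_term", "utm_source", "utm_medium", "utm_campaign"}


def has_utm_code(url: str) -> bool:
    """True if any query parameter name in url is one of the UTM codes."""
    query = urlsplit(url).query
    return any(
        key.lower() in UTM_KEYS
        for key, _ in parse_qsl(query, keep_blank_values=True)
    )


candidates = [
    "https://example.org/article",
    "https://example.org/article?utm_source=newsletter&utm_medium=email",
    "https://example.org/article?utm_campaign=spring",
    "https://example.org/article?page=2",
]

# Only URLs without UTM codes are allowed into the index.
crawlable = [url for url in candidates if not has_utm_code(url)]
print(crawlable)
```

Here the two tagged variants of the same article are skipped, so the index ends up with the clean address and the genuinely different page=2 address only.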

Why would they do this?

Online advertising is big business and there is fierce competition. UTM codes allow "marketing professionals" to show their clients that a marketing campaign produced results. It's also a way to track clicks and visitors without using cookies or the HTTP referer field, which more and more browsers strip.

Notes

  1. maketecheasier: What Is “UTM_Source” And Should You Be Worried? https://www.maketecheasier.com/what-is-utm-source/