Strip your URLs to super simple domains
The hunt for better RegEx
Since I've been working with frontend code, RegEx (short for Regular Expression) has been a constant and confusing minefield. RegEx has been around since the early 1950s, and with it comes a tonne of power but also a tonne of complexity.
The trouble with RegEx is that it's wholly unreadable. So a bunch of different tools have cropped up that let you input expressions and have them decoded/validated quickly and simply. This makes things a lot easier if you know what you're doing, but it's still hard to make sense of it by default.
Dan Eden wrote a great post on understanding RegEx from a designer/UX writer perspective, which offers a great way to make better sense of the complexity, but it still requires a tonne of work to construct a good pattern that works.
Clipping URLs
I was working on a personal SwiftUI app project to pull all of my Cosmic bookmarks into a native app so I could browse and enjoy them there, whilst also posting them publicly to my website.
Originally I just had the title and snippet text for each, which was fine (you can see this mirrored on my homepage) but I wanted to enhance this for easy scanning and for searching (using iOS' native search bar) by including the source URL.
The problem I had was clipping that URL. A lot of them came with a tonne of junk after the domain, which was mostly unnecessary, and all of them came with a prefix of https://www. or https:// which was added noise. What I wanted to find was a RegEx pattern that would allow me to clip out that prefix and suffix info, so I'd be left with a simple kejk.tech or docs.cosmicjs.com for example.
Hunting StackExchange and the web in general gave me loads of expressions but nothing ever quite hit the mark for me. So I decided to try and craft my own expression and provide it below so you can borrow it for your project too.
The great thing about RegEx is that it's largely platform agnostic. The basic syntax used here behaves the same in most engines, so I use the same expression in SwiftUI as I do in JavaScript.
I'm pretty confident that this is more verbose than it needs to be, but it works every time and cleans things up a treat. So, I present to you, the URL trimmer.
const regex = /.*https:\/\/www\.|http:\/\/www\.|https:\/\/|http:\/\/|\/.*$/gm;
The pattern opens with a dot (.), which matches any character, followed by an asterisk (*), which repeats that match zero or more times, so anything sitting in front of the first prefix is swept up too. After that, we basically have a bunch of OR matches using pipes (|):
https://www.
OR http://www.
OR https://
OR http://
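The order of those alternatives matters: the engine tries them left to right at each position, so the longer www. variants need to come before the bare ones. Here's a minimal sketch of that, using a made-up sample URL and two cut-down patterns that exist purely for illustration:

```javascript
// Two cut-down patterns, for illustration only (not the full trimmer).
const longerFirst = /https:\/\/www\.|https:\/\//;
const shorterFirst = /https:\/\/|https:\/\/www\./;

// Hypothetical sample URL.
const sampleURL = "https://www.kejk.tech";

// The longer alternative is tried first, so "www." is removed too.
const cleaned = sampleURL.replace(longerFirst, "");

// The shorter alternative wins first, so "www." is left behind.
const leftover = sampleURL.replace(shorterFirst, "");

console.log(cleaned);  // "kejk.tech"
console.log(leftover); // "www.kejk.tech"
```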
Then at the end we ask it to look for everything from the first solo forward slash (/) onwards, by using a dot (.) to match any character, then an asterisk (*) to repeat this an unlimited number of times (again), and then anchor the match to the end of the line with a dollar sign ($).
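To close it all up, we use the g flag to make this a global search that doesn't stop once it makes its first match. That matters here, because the prefix and the trailing path are two separate matches and both need replacing. Then we use the m flag for multi-line matching, so that $ matches at the end of every line rather than only at the end of the whole string.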
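To see the whole thing in action, here's a small sketch that runs the pattern over a few URLs. The sample URLs and their paths are made up for illustration, and the second pattern (the same expression minus the g flag) is only there to show what the flag is doing:

```javascript
// The pattern from this post, unchanged.
const regex = /.*https:\/\/www\.|http:\/\/www\.|https:\/\/|http:\/\/|\/.*$/gm;

// Hypothetical sample URLs, used only for illustration.
const samples = [
  "https://www.kejk.tech/some/path",
  "https://docs.cosmicjs.com/some/other/path",
  "http://example.com/",
];

// Replace every match (prefix and trailing path) with an empty string.
const trimmed = samples.map((url) => url.replace(regex, ""));
console.log(trimmed); // ["kejk.tech", "docs.cosmicjs.com", "example.com"]

// Same expression without the g flag: replace() stops after the
// first match, so the prefix goes but the path stays.
const withoutGlobal = /.*https:\/\/www\.|http:\/\/www\.|https:\/\/|http:\/\/|\/.*$/m;
const prefixOnly = samples[0].replace(withoutGlobal, "");
console.log(prefixOnly); // "kejk.tech/some/path"
```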
And that's it. It's pretty verbose, but it works across all URLs I've encountered and trims them up real nice. Below is how this appears when set up inside a React map. Here we create a new constant to fetch the URL from Cosmic and replace any occurrence of the regular expression with an empty value.
<div className="mt-4 grid grid-cols-1 gap-8 md:grid-cols-3">
  {bookmarks.map((bookmark) => {
    const trimmedURL = bookmark.metadata.url.replace(regex, "");
    return (
      <a
        key={bookmark.id}
        href={`${bookmark.metadata.url}`}
        target="_blank"
        rel="noreferrer"
      >
        <BookmarkCard
          title={bookmark.title}
          subtitle={bookmark.metadata.snippet}
          date={bookmark.metadata.published}
          url={trimmedURL}
        />
      </a>
    );
  })}
</div>;
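Since I mentioned the pattern is probably more verbose than it needs to be, here's a rough sketch of a shorter variant. Treat it as an assumption rather than a drop-in replacement: the compact pattern, the trimURL helper, and the sample URLs are all hypothetical, and I've only compared it against the handful of URLs shown.

```javascript
// The pattern from this post, unchanged.
const original = /.*https:\/\/www\.|http:\/\/www\.|https:\/\/|http:\/\/|\/.*$/gm;

// Hypothetical shorter variant: optional "s", optional "www.",
// anchored to the start of each line, then the same trailing-path match.
const compact = /^https?:\/\/(?:www\.)?|\/.*$/gm;

// Hypothetical helper so the trimming lives in one place.
function trimURL(url, pattern = compact) {
  return url.replace(pattern, "");
}

// Hypothetical sample URLs, used only for illustration.
const urls = [
  "https://www.kejk.tech/some/path",
  "https://docs.cosmicjs.com/some/other/path",
  "http://www.example.com",
];

// Check that both patterns agree on these samples.
const agree = urls.every((url) => trimURL(url, original) === trimURL(url, compact));
console.log(agree); // true
console.log(urls.map((url) => trimURL(url)));
```

One difference to be aware of: the original pattern also swallows any text sitting in front of https://www. thanks to its leading .* whereas this variant only strips a prefix at the very start of a line.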
Hopefully for any of you who do frontend work and might need something like this, you'll be able to take advantage. For everyone else, sorry!