By on Feb 20, 2019 at 7:31 AM

On the submit and edit listing pages, WSN has JavaScript that automatically adds http:// to the start of the URL if someone forgets to type a scheme. The submit/edit backend also has logic that attempts to fix URL inputs. Despite this, I came across a client site with all sorts of junk data in that field.

Some of it may be very old data from before WSN automatically prepended the scheme, and some probably comes from old imported data. To address these scenarios, I've made the 10.4.4 Beta 3 and later upgrades seek out and fix botched listing URLs. Each upgrade processes only 50 of them, so that the upgrade doesn't risk a timeout from too many queries, but the cleanup runs again on each future upgrade.
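The batched cleanup described above can be pictured roughly like this. It's only a minimal sketch under my own assumptions, not WSN's actual upgrade code: the function names add_missing_scheme and fix_url_batch are hypothetical, it works on a plain array rather than the database, and it assumes that "fixing" a URL just means prepending http:// when no scheme is present.

```php
<?php
// Hypothetical helper (not a WSN function): prepend http:// when the
// value has no "://" scheme separator. Empty values are left alone.
function add_missing_scheme(string $url): string
{
    $url = trim($url);
    if ($url === '' || strpos($url, '://') !== false) {
        return $url; // nothing to fix
    }
    return 'http://' . $url;
}

// Hypothetical batch step: change at most $limit values per run, so a
// single run stays cheap. Anything left over waits for the next run.
function fix_url_batch(array $urls, int $limit = 50): array
{
    $fixed = 0;
    foreach ($urls as $key => $url) {
        if ($fixed >= $limit) {
            break;
        }
        $new = add_missing_scheme($url);
        if ($new !== $url) {
            $urls[$key] = $new;
            $fixed++;
        }
    }
    return $urls;
}
```

Calling fix_url_batch() repeatedly on its own output mimics running the upgrade several times: each pass repairs up to 50 more entries until nothing is left to change.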

To check what bad URLs your site has, you can run this PHP at Admin -> Miscellaneous -> Advanced:
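// Find listings whose URL is non-empty but has no scheme separator.
$q = $db->select('url', 'linkstable', "url != '' AND url NOT LIKE '%://%'");
$n = $db->numrows($q);
echo "<p>$n invalid urls</p>";
// Print every match, not just the first. This assumes rowitem() advances to the next row on each call.
for ($i = 0; $i < $n; $i++) echo htmlspecialchars($db->rowitem($q)) . " ";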

I've also added new normalization logic on all fields intended for URL input (it was already present in some cases, but not all). Now, when the input in such a field doesn't contain a '.' (which every URL on the public internet should contain, though intranet URLs may not), it's assumed to be bad data and the field is blanked. So as not to mess too badly with anyone using WSN for intranet purposes, I made an exception: a URL containing 3 or more forward slashes is accepted even if it doesn't have a dot. Thus, as of this release, "http://pgk-desktop/dir" is valid, but "http://pgk-desktop" and "http://random junk" aren't.
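For anyone who wants to see that rule spelled out, here's a minimal sketch of it. The function name normalize_url_field is hypothetical and this isn't the code WSN actually ships. It implements only the two checks described above (a dot, or at least 3 forward slashes) and nothing else.

```php
<?php
// Hypothetical illustration of the rule described in the post:
// keep the value if it contains a '.', or if it contains 3 or more
// forward slashes (the intranet exception); otherwise blank it.
function normalize_url_field(string $url): string
{
    $url = trim($url);
    if ($url === '') {
        return '';
    }
    if (strpos($url, '.') !== false) {
        return $url; // looks like a public internet URL
    }
    if (substr_count($url, '/') >= 3) {
        return $url; // intranet exception, e.g. http://host/dir
    }
    return ''; // assumed to be bad data
}
```

With this sketch, "http://pgk-desktop/dir" comes back unchanged, while "http://pgk-desktop" and "http://random junk" both come back as an empty string, matching the examples above.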

This brought up an issue. Suppose the URL field is required, and somebody enters a junk value in the field. WSN needs to tell the submitter that their submission is incomplete and they need to fill in the URL field properly. This might sound like it'd be a simple change, but alas it ended up requiring a ...


