Repeating FriendlyURL/Orphaned navigation

Hello all. I have two issues I have been struggling with, which may or may not be related. Both were uncovered thanks to the SolrPro plugin, though I see no other likely connection to it. We’re currently on FarCry 7.2.8. The bullet points at the end are a very brief summary if you don’t want to read all of this to find my problems.

Sometimes we come across sites with a system-generated friendly URL that reads as (for example) ‘/address/address’. Repetition seems to be a key factor here. After clicking on one of these links, users are told “you have been denied access to this item” with nothing else on the page, not even the site’s header and footer. Google Analytics still finds its way through, but that’s it.

When I dig out the object id for the page’s parent navigation node and enter it, a login prompt pops up. This login fails even for an account with admin privileges. I’ve also pulled out the breadcrumb trail SQL so that I can find where a given node is located in our site. Whenever this kind of error has come up, that SQL query comes back empty. I did reproduce this once: I kept watch on a page a couple of levels down from a node I then deleted, and that page showed the behavior outlined above. I wonder if FarCry isn’t recursively deleting descendants of a given node?

I have no idea where these friendly URLs are coming from. Cleaning them out and removing orphaned data, if that’s what’s happening, would be nice.

Removing them from our Solr index somehow could be an acceptable stopgap, but I’m not sure how I’d make that happen, or whether it’s possible. Maybe @seancoyne or @jeff have a trick up their sleeve?

Summary since that’s a lot of text:

  • Repeating friendly URLs (example: ‘/address/address’) leading to access denied pages

  • Navigation nodes appear not to delete recursively, leading to orphaned content

  • If it’s faster or easier to make Solr avoid these, I’d be happy with that solution, since Solr is how we know these exist.

Happy to dig further for the curious. I hope I’ve defined the problem well.

You can write your own contentToIndex function to specify exactly which items are indexed. So you could create one for dmnavigation that limits it to items in the tree.

Sorry, I’m away from my desk, so it’s hard to comment on the other items. I’ll try to do so when I have a chance.

Sounds promising. I found an example of doing this on the FarCry Solr Pro documentation page. I’ll try it out and ask here if I get stuck.

In testing it looks like I’ve successfully filtered out all content that can’t be traced back to the root node. The query I am using runs through nested_tree_objects until it finds the root node and excludes anything that doesn’t make it.
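For anyone following along, the filtering idea can be sketched roughly as below. This is an illustration in Python rather than the actual query, and it assumes each row of nested_tree_objects can be reduced to an object id and a parent id; the function name, the sample ids, and that two-column shape are all hypothetical stand-ins, so adjust for the real schema.

```python
from typing import Dict, Optional, Set


def reachable_from_root(parent_of: Dict[str, Optional[str]], root_id: str) -> Set[str]:
    """Return the ids whose chain of parents ends at root_id.

    parent_of maps an object id to its parent id (None when there is no parent).
    This mapping is an assumed simplification of the tree table, not its real layout.
    """
    reachable = {root_id}   # ids known to trace back to the root
    orphaned: Set[str] = set()  # ids known NOT to trace back to the root

    for start in parent_of:
        path = []
        seen = set()
        node: Optional[str] = start
        # Walk upward until we hit an id we already classified, run out of
        # parents, or loop back on ourselves (a cycle counts as orphaned).
        while (
            node is not None
            and node not in reachable
            and node not in orphaned
            and node not in seen
        ):
            seen.add(node)
            path.append(node)
            node = parent_of.get(node)

        if node in reachable:
            reachable.update(path)
        else:
            orphaned.update(path)

    return reachable


# Hypothetical sample data: "ghost" points at a parent that no longer exists.
rows = {
    "root": None,
    "home": "root",
    "address": "home",
    "ghost": "deleted-node",
    "ghost-child": "ghost",
}
in_tree = reachable_from_root(rows, "root")
```

Only the ids in `in_tree` would be handed to the indexer; anything hanging off a deleted parent is left out, which matches the orphaned pages described earlier in the thread.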

If there is a better way to ensure I’m only including content from the site tree I’m happy to hear it.

As an aside, the sample contentToIndex code from the FarCry Solr Pro documentation page was very easy to work with, since I only needed to change the query for my use. The only way it could have been easier is if there were a function call ready to go that I could just pass the query into. Thanks again @seancoyne for the tip.

This removes the worst symptoms of this junk data that’s gathering in our DB. We are interested in cleaning it out if we can, so pointers on doing so properly would be helpful. If there is a legitimate bug hidden in here I’d be happy to help pinpoint it as well.