I’m having an issue with Booleans and SOLR Pro and I’m wondering if others have had this issue.
I have a type that contains a lot of Boolean values and the FarCry type is set to Boolean. Therefore when I created the SOLR Pro type I set the SOLR Field Type to Boolean.
The problem I have is that regardless of the text I search for, every record is being returned. I’ve discovered that by changing the Boolean SOLR Field Type from Boolean to Text (making sure to delete the Boolean entry) the query works as expected.
I think I must have come across this before as I found notes in another project where I commented on issues with SOLR Boolean and had all those set to text as well.
What version of the plugin are you running? Does your Boolean field allow nulls?
I’ve seen in the past a bug where a date field that had null values was causing issues so records were not being indexed properly. I’m wondering if this bug is similar, but I thought when I fixed the date bug that I also fixed it for other data types. Also, that problem was an indexing issue: it was keeping records out of the index. In your case it seems that it’s returning incorrect values.
In any case, make sure you’re on the most recent version of the plugin, which as of this writing is 1.2.11.
If you continue to have the issue, can you share your content type CFC that you are indexing, your schema.xml, and maybe a screenshot or a listing of your indexing setup for that content type (which fields are being indexed and which data types are used, etc)? That will help with debugging.
Plugin is now 1.2.11 and it’s running on FarCry 6.2.10. I was running an older version of the plugin until yesterday but upgraded in an attempt to fix the issue.
I’m not in the office so I can’t check but I’m 90% sure it doesn’t allow NULL values. However, the data is actually pushed from another system into FarCry and I’m 100% sure that system doesn’t allow NULL values for the Boolean fields. But I will double check this.
One interesting thing I noticed late last night was that even though all records were being incorrectly returned, the ones that should have been returned were always at the top of the list.
Even though I think it has zilch to do with this problem (as our production environment has been running like this since you and Jeff released SOLR Pro) just making you aware that the SOLR engine is installed on a different cluster to the websites. On a side note, it works really well like this.
We designed it so it could run on a separate server. If you continue to have problems share more about the configuration and I’ll try to help figure it out. We like to squash bugs.
Sorry it took me so long to get back to this, Sean. I wanted to build a custom type etc. to test with but have been sidetracked by a major application launch.
This ZIP file contains two files. testSOLRpro is for your packages/types folder and the other is just a file to populate the type with some custom data (note: need to update DSN etc. in this file). It contains 5 records.
Here is what I noticed with this:
In SOLRPro I set the two boolean fields to “boolean” (all others to defaults).
Index the type (sample contains 5 records)
Used the search test that is built in to the SOLR Pro plugin** to search for one of the unique keywords (dog); it returned all 5 records, which is wrong
Deleted the Boolean option in SOLR admin for both boolean fields and changed both to Text
Index the type again
Repeat the same search - now correctly only returns one record
As a test I only put one field back to Boolean and left the other as text, repeated the Index etc. and to my surprise it worked correctly when I searched
Put them both back to Boolean and it failed again. Tried this with more than 2 boolean fields and it always fails. So it seems that as long as there is only one boolean field it works fine
** I was using /webtop/admin/customadmin.cfm?module=customlists/solrTestSearch.cfm&plugin=farcrysolrpro to do my test searches. Figured that by using the one supplied with the plugin it eliminates it being anything related to a custom search form.
At first glance (haven’t run the code yet), the boolean fields do allow for null values. required="no" means null is allowed. I’ll let you know what I find.
True but the default is set to a value which I thought caused them to not be NULL. Aside though, the script that builds the sample data is setting a value for all Boolean fields anyway.
Finally had a chance to look at this. I noticed that it only does it if the boolean fields are stored. To be even more specific, if multiple boolean fields are stored. So you can have all 5 booleans use Solr’s “boolean” type, as long as you don’t store more than one of them. Very weird.
So if you mark them as “Not Stored” it works as expected. If you don’t need to display this information (or don’t mind querying the database to get it instead of just getting it from Solr) then that could be a workaround.
I’m about to try a few different things to see if I can figure out WHY it’s happening. Very strange bug.
Wait. Sorry, never mind. My fault. It happens whether they are stored or not stored. I was running the wrong test query. Will respond further when I find out more.
OK, so I looked into this and I could not find a reason why it happens. On a whim I also tried upgrading Solr to 4.10.3 to see if it was a bug in Solr 3.5. Though I didn’t get the plugin working with 4.10.3 yet, I did get Solr itself running on my test index with your setup and it exhibited the same behavior.
So, my thinking at this time is to remove non-text fields from the search that is included with the plugin. It’s a text-based search, so having booleans, dates, numbers, etc. in there is really no use. No text is ever going to match, and even if it does, it’s unlikely to be a relevant match anyway.
Jeff brought up the possibility of a “product number” but my thoughts on that are that a product number is like a zip/postal code. Yes, they are digits, but it’s not really a number. You don’t do math with it. So really it should be indexed in Solr as a string, text, or phonetic. If you do that it would be included in the search and would match if someone searched for a product number.
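A quick illustration of why such identifiers behave like text rather than numbers. This is a Python sketch with a made-up product number, not anything taken from the plugin:

```python
# Hypothetical product number; the value is made up for illustration.
product_number = "00731"

# Treating it as a number silently drops the leading zeros.
as_number = int(product_number)
round_tripped = str(as_number)

# The identifier no longer matches what a user would type into a search box.
print(product_number, round_tripped, product_number == round_tripped)
```

Kept as a string, the value stays exactly as entered, which is what a keyword search needs to match.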
So, any thoughts on this? Can anyone think of a reason why non-string based types should be included in the basic, out of the box search that the plugin provides?
You can always write your own custom searches to search against booleans, dates, numbers, etc. Really, if you need to search those a custom search is already required, so I think this will not only fix this issue, but it should let searches perform even more quickly since it only has to search against a smaller subset of fields. So if you have a content type with a lot of indexed fields and many of them are not string/text then it should reduce the number of fields searched substantially.
Thank you for taking a look at this. I feel a little less stupid now knowing that you also couldn’t find a reason for this behaviour, and that you also couldn’t get the plugin working with SOLR 4, as I previously attempted that as well.
Obviously I can’t speak for others but from my POV the non-string fields are very useful (especially boolean if they worked) as they can be used to greatly reduce the number of records being retrieved. E.g. we have a boolean field for each of our course delivery types: online, face to face, distance, mobile units, satellite etc. We often need to filter the results down to, for example, online-only courses. When searching we append a filter to the request to only get results with online set to true.
That said, maybe it doesn’t really matter if the Boolean field support is dropped.
On our sites, because the boolean type wasn’t working, I reverted to using these fields as string or integer (I think I had some problem with integer so settled on string). I don’t know if this has any impact on the performance of SOLR. Normally I would think when checking boolean vs. string, boolean would always be quicker but I’m not so sure with SOLR. So…as this seems to work fine maybe it isn’t any loss if the plugin didn’t support boolean etc.
As I noted, you’d still be able to use those fields to filter. They just wouldn’t be included in the “qf” parameter used in the out-of-the-box search. You could still use them to further filter and of course could use them in any custom search you create.
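As a rough sketch of that split (in Python rather than CFML, and assuming an edismax-style request), the text fields go into qf while the boolean is applied as a separate fq filter. The field names title, teaser, body and online, and the helper names, are made up for illustration and are not taken from the plugin:

```python
from urllib.parse import urlencode

def format_filter_value(value):
    """Render a Python value the way Solr expects it in a filter query."""
    if isinstance(value, bool):
        return "true" if value else "false"
    return str(value)

def build_search_params(keywords, text_fields, filters):
    """Build request parameters: text fields in qf, everything else as fq."""
    params = [
        ("q", keywords),
        ("defType", "edismax"),
        # Only string/text fields take part in the keyword match.
        ("qf", " ".join(text_fields)),
    ]
    # Each fq narrows the result set without taking part in the keyword match.
    for field, value in filters.items():
        params.append(("fq", f"{field}:{format_filter_value(value)}"))
    return params

# Hypothetical field names, for illustration only.
params = build_search_params("dog", ["title", "teaser", "body"], {"online": True})
query_string = urlencode(params)
print(query_string)
```

The point of the design is that the keyword only ever gets matched against fields where a text match makes sense, while the boolean still restricts which records come back.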
OK, I have released a new build, v1.3.0, that makes this change. Non-string fields (int, long, float, double, date, location, boolean, array) will not be included in the field list for the default search that we include with the plugin.
Again, any field you index is available to you to use in custom searches you build yourself.
There is also an argument (bIncludeNonString) to the various functions that handle building the field list. You can always override the default search and modify it to include other fields as you find necessary.
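For anyone wondering what that filtering amounts to, here is a minimal sketch of the idea in Python. It is not the plugin’s actual code: the function name, the include_non_string flag (a stand-in for bIncludeNonString) and the sample field names are all assumptions. Only the list of excluded types comes from the release note above.

```python
# Field types excluded from the default search, as listed in the release note.
NON_STRING_TYPES = {
    "int", "long", "float", "double", "date", "location", "boolean", "array",
}

def build_field_list(indexed_fields, include_non_string=False):
    """Return the names of the fields a keyword search should cover.

    indexed_fields maps a field name to its Solr field type.
    include_non_string is a hypothetical stand-in for bIncludeNonString.
    """
    return [
        name
        for name, field_type in indexed_fields.items()
        if include_non_string or field_type not in NON_STRING_TYPES
    ]

# Hypothetical content type, for illustration only.
fields = {
    "title": "text",
    "body": "text",
    "online": "boolean",
    "startdate": "date",
    "productcode": "string",
}

default_fields = build_field_list(fields)
all_fields = build_field_list(fields, include_non_string=True)
print(default_fields)
print(all_fields)
```

With the flag left at its default, only title, body and productcode end up in the list; passing the flag brings the boolean and date fields back in.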