Solr

Daniel Miller dmiller at amfes.com
Fri Dec 21 18:19:42 EET 2018


Joan,

The reason for dropping squat, I'm assuming, is that Lucene and Solr 
potentially provide superior features & performance, and as they are 
3rd-party libraries & apps this reduces the maintenance burden and 
lets the Dovecot team focus on mail-server-specific stuff while others 
focus on FTS.  There is a *huge* difference between a functional 
Solr setup & squat - and if I'm able to get it working we should be able 
to get you there as well.

I don't recall what OS you're running - I'm on Ubuntu 18.04.  My Java 
version is OpenJDK 10.0.2.  Attached is my complete Solr config.  Try 
one more time - stop the server, delete the data folder, unpack the 
attached into the conf folder - and restart.  I also have the following:


/etc/default/solr.in.sh:
SOLR_OPTS="$SOLR_OPTS -Dsolr.autoSoftCommit.maxTime=3000"
SOLR_OPTS="$SOLR_OPTS -Dsolr.autoCommit.maxTime=60000"
SOLR_PID_DIR=/run/solr
SOLR_HOME=/usr/local/lib

Adjust the above folders as appropriate - or don't use them at all if 
you're using the defaults.
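
The reset described above (stop the server, delete the data folder, unpack the config into the conf folder, restart) could be scripted roughly as below. This is only a sketch under assumptions: CORE_DIR and CONF_ARCHIVE are placeholder paths rather than values from this setup, the service name "solr" assumes the systemd unit is in use, and the tar step assumes the attachment is a bzip2-compressed tarball. It defaults to a dry run that only prints the commands.

```shell
#!/bin/sh
# Sketch of the reset procedure: stop Solr, drop the index data,
# install the config, restart. Paths are ASSUMPTIONS - adjust them.
CORE_DIR="${CORE_DIR:-/var/solr/data/dovecot}"             # hypothetical core directory
CONF_ARCHIVE="${CONF_ARCHIVE:-/tmp/dovecot-conf.tar.bz2}"  # hypothetical archive path
DRY_RUN="${DRY_RUN:-1}"                                    # 1 = only print the commands

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}

# Stop, wipe the data folder, unpack the config, start again.
# Each step only runs if the previous one succeeded.
reset_core() {
    run systemctl stop solr &&
    run rm -rf "${CORE_DIR:?}/data" &&
    run tar -xjf "$CONF_ARCHIVE" -C "$CORE_DIR/conf" &&
    run systemctl start solr
}
```

Call reset_core first to see what would be executed, then DRY_RUN=0 reset_core (as root) once the paths look right.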


/etc/systemd/system/solr.service:
# put this file in /etc/systemd/system/ as root
# below paths assume solr installed in /opt/solr, SOLR_PID_DIR is /run/solr
# and that all configuration exists in /etc/default/solr.in.sh, which is
# the case if previously installed as an init.d service
# change port in pid file if differs
# note that it is configured to auto restart solr if it fails
# (Restart=on-failure) and that's the motivation indeed :)
# to switch from systemv (init.d) to systemd, do the following after
# creating this file:
# sudo systemctl daemon-reload
# sudo service solr stop # if already running
# sudo systemctl enable solr
# sudo systemctl start solr
# this was inspired by
# https://confluence.t5.fi/display/~stefan.roos/2015/04/01/Creating+systemd+unit+(service)+for+Apache+Solr
[Unit]
Description=Apache SOLR 7.5.0
After=syslog.target network.target remote-fs.target nss-lookup.target systemd-journald-dev-log.socket
Before=multi-user.target graphical.target nginx.service dovecot.service
Conflicts=shutdown.target
[Service]
LimitNOFILE=65000
User=vmail
Group=mail
ExecStartPre=/bin/mkdir -p /run/solr
ExecStartPre=/bin/chown -R vmail:mail /run/solr
PermissionsStartOnly=true
PIDFile=/run/solr/solr-8983.pid
Environment=SOLR_INCLUDE=/etc/default/solr.in.sh
ExecStart=/opt/solr/bin/solr start
ExecStop=/opt/solr/bin/solr stop
Restart=on-failure
RestartSec=15s
TimeoutStopSec=30s
[Install]
WantedBy=multi-user.target graphical.target dovecot.service

If you don't use systemd disregard - but see if any of the above applies 
for your setup.
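
After a restart, a quick check that the unit is up and the core is loaded might look like the sketch below. The values are assumptions: port 8983 is taken from the PID file name above, and the core name "dovecot" and the base URL are guesses to adjust for your install. It uses Solr's CoreAdmin STATUS action via curl.

```shell
#!/bin/sh
# Sketch: confirm the unit is running and the core is loaded.
# SOLR_URL and CORE are ASSUMPTIONS (port 8983 as in the PID file name,
# core named "dovecot"); adjust to your setup.
SOLR_URL="${SOLR_URL:-http://localhost:8983/solr}"
CORE="${CORE:-dovecot}"

# Build the CoreAdmin STATUS URL for the core.
core_status_url() {
    printf '%s/admin/cores?action=STATUS&core=%s\n' "$SOLR_URL" "$CORE"
}

# Fail early if the systemd unit is down, otherwise query Solr.
check_solr() {
    if ! systemctl is-active --quiet solr; then
        echo "solr unit is not active; see: journalctl -u solr" >&2
        return 1
    fi
    # -f: fail on HTTP errors, -sS: quiet but still show errors
    curl -fsS "$(core_status_url)"
}
```

Running check_solr prints the core status returned by Solr, or a hint to look at the journal if the unit is not running.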

Let me know what happens.  I agree this can be a mortal pain to set up - 
but it's worth it.

Daniel

On 12/21/2018 4:33 AM, Joan Moreau wrote:
>
> Dear Daniel.
>
> Thank you for your kind reply.
>
> Regarding NFS, no, there is nothing like this in my setup.
>
> Deleting Solr and recreating it, I did it so many times already.
>
> I started with *your* setup in the first place, as FTS_squat (which 
> actually works very well and is very straightforward; I have no clue 
> why we are going for Solr, which is just a pain, and not maintaining 
> squat), and it leads to totally funny results (for instance, I type 
> "emirates" in my "Air Companies" subfolder and get a lot of results .. 
> but of competing companies :D )
>
> I added fts_enforced, following Aki's advice.
>
> I removed fts_decoder for the time being.
>
> I don't know where to go now. Dovecot is still returning errors and 
> Solr is still complaining with "Out of range" and other Java errors.
>
> Bottom line, I am back to squat, but as it is not maintained, it 
> also crashes from time to time.
>
>
> I think we should discuss:
>
> (1) Why the damn choice of Solr has been made. As you emphasised, 
> maintaining so many independent pieces of software is a pain
>
> (2) If there is a real reason for going with Solr, how to have a 
> working (i.e. getting the right results to the end user) setup?
>
> (3) If there is no tangible reason, what about maintaining fts_squat, 
> which did the job nicely for years with no complaints.
>
>
>
>
>
> On 2018-12-16 08:51, Daniel Miller via dovecot wrote:
>
>> Joan,
>>
>> I understand and sympathize with your frustration - trying to get 
>> multiple applications to work together, particularly given the lack 
>> of documentation for some of them, can be extremely challenging.  
>> That said, I suggest you consider an alternative viewpoint.  
>> Frequently being misunderstood myself I apologize in advance if I'm 
>> reading you wrong - but it appears your view towards the situation is 
>> there is a bug in Dovecot related to this problem.  That may well be 
>> - but I generally approach these matters from the assumption that *I* 
>> made the error in configuration and go from there.  I'm not an 
>> official rep for any product nor claim to be any form of expert in 
>> these matters - but I do have a working setup and I'd like to help 
>> you if I can.  If you're willing to - take a deep breath and let's 
>> try starting over.
>>
>> Looking back through your emails there were two items that stood out 
>> - your Dovecot config has two settings I don't use: "fts_decoder" and 
>> "fts_enforced".  I also asked you earlier whether or not NFS is 
>> involved here and I didn't see an answer - please clarify.
>>
>> I suggest you try once more: delete Solr completely. Re-install per 
>> the directions and use *my* managed-schema. Also comment out the 
>> Dovecot directives for "fts_decoder" and "fts_enforced" so you're 
>> closer to my setup.  Try running again and then post back - I'll do 
>> what I can.  The fact that Dovecot+Solr 7.5+my schema is working 
>> for me leads me to believe we can get it working for you as well.
>>
>> Daniel
>>
>> On 12/15/2018 2:42 PM, Joan Moreau wrote:
>>>
>>> here is my latest schema.xml (removed the "long" type, which seems 
>>> to be very deprecated in 7.x)
>>>
>>> <?xml version="1.0" encoding="UTF-8"?>
>>> <schema name="dovecot" version="2.0">
>>> <uniqueKey>id</uniqueKey>
>>> <types>
>>> <fieldType name="string" class="solr.StrField" />
>>> <fieldType name="gjlong" class="solr.LongPointField" 
>>> positionIncrementGap="0" />
>>> <fieldType name="gjtext" class="solr.TextField" 
>>> autoGeneratePhraseQueries="true" positionIncrementGap="100">
>>> <analyzer type="index">
>>> <tokenizer class="solr.StandardTokenizerFactory"/>
>>> <filter class="solr.StopFilterFactory" words="stopwords.txt" 
>>> ignoreCase="true"/>
>>> <filter class="solr.WordDelimiterGraphFilterFactory" 
>>> generateWordParts="1" generateNumberParts="1" splitOnCaseChange="1" 
>>> splitOnNumerics="1" catenateWords="1" catenateNumbers="1" 
>>> catenateAll="1"/>
>>> <filter class="solr.FlattenGraphFilterFactory"/> <!-- required on 
>>> index analyzers after graph filters -->
>>> <filter class="solr.LowerCaseFilterFactory"/>
>>> <filter class="solr.NGramFilterFactory" minGramSize="3" 
>>> maxGramSize="15" />
>>> <filter class="solr.KeywordMarkerFilterFactory" 
>>> protected="protwords.txt"/>
>>> <filter class="solr.PorterStemFilterFactory"/>
>>> </analyzer>
>>> <analyzer type="query">
>>> <tokenizer class="solr.StandardTokenizerFactory"/>
>>> <filter class="solr.SynonymGraphFilterFactory" expand="true" 
>>> ignoreCase="true" synonyms="synonyms.txt"/>
>>> <filter class="solr.FlattenGraphFilterFactory"/> <!-- required on 
>>> index analyzers after graph filters -->
>>> <filter class="solr.StopFilterFactory" words="stopwords.txt" 
>>> ignoreCase="true"/>
>>> <filter class="solr.WordDelimiterGraphFilterFactory" 
>>> generateWordParts="1" generateNumberParts="1" splitOnCaseChange="1" 
>>> splitOnNumerics="1" catenateWords="1" catenateNumbers="1" 
>>> catenateAll="1"/>
>>> <filter class="solr.LowerCaseFilterFactory"/>
>>> <filter class="solr.NGramFilterFactory" minGramSize="3" 
>>> maxGramSize="15" />
>>> <filter class="solr.KeywordMarkerFilterFactory" 
>>> protected="protwords.txt"/>
>>> <filter class="solr.PorterStemFilterFactory"/>
>>> </analyzer>
>>> </fieldType>
>>> </types>
>>> <fields>
>>> <field name="_version_" type="string" indexed="true" stored="true"/>
>>> <field name="bcc" type="string" indexed="false" stored="false"/>
>>> <field name="body" type="gjtext" indexed="true" stored="false"/>
>>> <field name="box" type="string" indexed="true" required="true" 
>>> stored="true"/>
>>> <field name="hdr" type="gjtext" indexed="false" stored="false"/>
>>> <field name="cc" type="gjtext" indexed="true" stored="false"/>
>>> <field name="from" type="gjtext" indexed="true" stored="false"/>
>>> <field name="id" type="string" indexed="true" required="true" 
>>> stored="true"/>
>>> <field name="subject" type="gjtext" indexed="true" stored="false"/>
>>> <field name="to" type="gjtext" indexed="true" stored="false"/>
>>> <field name="uid" type="string" indexed="true" required="true" 
>>> stored="true"/>
>>> <field name="user" type="string" indexed="true" required="true" 
>>> stored="true"/>
>>> </fields>
>>> </schema>
>>>
>>>
>>>
>>> On 2018-12-15 20:54, Joan Moreau wrote:
>>>
>>>     Daniel,
>>>     I have done that so many times (deleting the data folders,
>>>     recreating the instance, restarting etc...)
>>>     But this is really not the issue.
>>>     The issue is:
>>>     1 - fts_solr reports errors in the log file (this is a pure
>>>     dovecot issue): how to get much more detail on what fts_solr
>>>     sends to the Solr server and what it returns?
>>>     2 - Solr responds properly for a few hours, then starts crashing
>>>     or responding with nonsense after some time
>>>     Additionally, is there documentation for fts-squat, in order to
>>>     adjust the code to new releases of dovecot?
>>>
>>>     On December 12, 2018 4:44:10 PM Daniel Miller via dovecot
>>>     <dovecot at dovecot.org> wrote:
>>>
>>>         On 12/11/2018 4:46 AM, Joan Moreau via dovecot wrote:
>>>
>>>             I shared the errors already so many times (check this
>>>             mailing list for "solr" in the title)
>>>
>>>             Contrary to what you say, with Solr 7.5 and Dovecot
>>>             git, I had to remove the "managed-schema" to make solr
>>>             respond a bit properly. It relies on schema.xml
>>>
>>>             In order to create the instance, no, it copies the
>>>             default config in the dovecot instance.
>>>
>>>         I'm not a Solr expert by any means but I believe you are
>>>         incorrect.
>>>
>>>         As of Solr 5.x the managed-schema file is the primary method
>>>         for configuration.  The method I detailed previously for
>>>         setting up a config helps automate creating new Solr
>>>         instances - but as I stated you can either set up a Solr
>>>         template and then create the instance from that or create an
>>>         instance using the default template and then adjust it.
>>>
>>>         The part that you *must* do after creating from the default
>>>         template is stop the server, delete the entire
>>>         "<prefix>/solr/dovecot/data" folder, then install the
>>>         correct managed-schema file, then restart the server.  The
>>>         server will not function with mismatched schema/data.
>>>
>>>         If you'll try that - explicitly "rm -rf
>>>         <prefix>/solr/dovecot/data", copy the managed-schema file
>>>         into the conf folder, and restart - things will either work
>>>         or there's something else that needs correction.
>>>
>>>         --
>>>         Daniel
>>>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: solr-server-solr-configsets-dovecot-conf.bz2
Type: application/octet-stream
Size: 62841 bytes
Desc: not available
URL: <https://dovecot.org/pipermail/dovecot/attachments/20181221/289f7507/attachment-0001.obj>

