Kaltura-sphinx searchd is crashing with error 11 (segmentation fault)

While troubleshooting an issue with the boot-up of Kaltura, we noticed at service start-up that kaltura-sphinx searchd crashes with error 11 (segmentation fault) during the last index phase.

index kaltura_base
{
type = rt
charset_type = utf-8
rt_mem_limit = 1024M

}

searchd
{
log = /opt/kaltura/log/sphinx/kaltura_sphinx_searchd.log
query_log = /opt/kaltura/log/sphinx/kaltura_sphinx_query.log
query_log_format = sphinxql
read_timeout = 5
max_children = 30
pid_file = /opt/kaltura/sphinx/searchd.pid
max_matches = 10000
preopen_indexes = 0
unlink_old = 1
workers = threads
binlog_path = /opt/kaltura/log/sphinx/data
binlog_flush = 1
rt_flush_period = 3600
listen = 0.0.0.0:9312:mysql41
}
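
Since the `listen` line above exposes port 9312 with the `mysql41` protocol, a running searchd can be queried with a regular MySQL client. The following is a minimal sketch of a liveness check, assuming a `mysql` client is installed on the host; the function name `sphinxql` and the `MYSQL_BIN`, `SPHINX_HOST`, and `SPHINX_PORT` variables are hypothetical names introduced for illustration.

```shell
#!/bin/sh
# Sketch only: send one SphinxQL statement to searchd over the MySQL protocol.
# MYSQL_BIN, SPHINX_HOST and SPHINX_PORT are illustrative overrides (assumptions).
# 127.0.0.1 is used instead of "localhost" so the client connects over TCP
# rather than a local MySQL socket.
sphinxql() {
    "${MYSQL_BIN:-mysql}" \
        -h "${SPHINX_HOST:-127.0.0.1}" \
        -P "${SPHINX_PORT:-9312}" \
        -e "$1"
}

# Example (run manually once searchd is up):
#   sphinxql "SHOW TABLES"
```

If searchd has crashed, the client should fail to connect, which makes this a quick way to confirm whether the daemon survived start-up.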

Hi @astrava.

Sphinx 2.0.4 is very old. What version of the Kaltura Server are you running?

I recommend that you upgrade to the latest stable Kaltura Server version [13.14.0] which includes Sphinx 2.2.1-id64-dev (r4097).

Hi Jess,

Thank you for your reply. What is your advice on upgrading? Any suggestions will be greatly appreciated.

packaged install

sphinx 2.2.1-id64-dev (r4097)
kaltura v9.18 (2014)
RHEL 6.9

Upgrade Kaltura

This will only work if the initial install was using this packages based install

I appreciate your help.
Dmitri

Hi @astrava,

There is a guide for upgrading from pre-RPM versions of CE here:


The process is not a fully automated one but the howto and scripts included in this directory should help you.

If it is not utterly crucial to preserve the original IDs [and we’re not just talking about entry IDs but also partner IDs, UI conf IDs, etc], a faster and less error-prone approach might be to export all your assets from the original server and re-ingest them on the new one using the API.
See the discussion here:

If the searchd binary you currently have includes the debug symbols, you can try to generate a coredump and check the stack trace with GDB to understand why it crashes. I explain how to do that in this thread:

However, since you run a very old version, I cannot provide you with further support unless you upgrade.
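
A minimal sketch of the coredump-and-backtrace step mentioned above, not the exact procedure from the linked thread. The binary path follows the /opt/kaltura/sphinx layout seen elsewhere in this thread and is an assumption; `has_debug_info`, `print_backtrace`, and the `GDB` override variable are hypothetical names introduced for illustration. Where the kernel writes the core file depends on `kernel.core_pattern`, so the core path has to be supplied by hand.

```shell
#!/bin/sh
# Sketch only. Assumed binary location; adjust to your install.
SEARCHD_BIN="${SEARCHD_BIN:-/opt/kaltura/sphinx/bin/searchd}"

# Coarse check: succeeds if the ELF binary carries a .debug_info section.
# Debug info shipped in a separate file would not be detected here.
has_debug_info() {
    readelf -S "$1" | grep -q '\.debug_info'
}

# Print a backtrace of every thread from a core file, non-interactively.
# Usage: print_backtrace <binary> <core-file>
print_backtrace() {
    "${GDB:-gdb}" --batch -ex "thread apply all bt" "$1" "$2"
}

# Typical manual sequence (commented out so nothing runs by accident):
#   ulimit -c unlimited            # in the shell that launches searchd
#   service kaltura-sphinx start   # wait for the crash
#   print_backtrace "$SEARCHD_BIN" /path/to/core > /tmp/searchd_bt.txt
```

The resulting text file is what you would attach to the thread as the stack trace.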

Hi Jess,

We have two servers. The incident is on the server with the following versions:
sphinx 2.2.1-id64-dev (r4097)
kaltura v9.18 (2014)

Any suggestions will be greatly appreciated.
I appreciate your help and your time.
Thanks,
Dmitri

Hi @astrava,

While many things have changed since 9.18 and we no longer officially support it, the Sphinx version has remained the same.
IMPORTANT NOTE: This is not true for the older CE version you have with Sphinx 2.0.4, for which I can offer you no other solution but to upgrade.

With 9.18, you should be able to obtain the RPM package for kaltura-sphinx from the latest repo, extract the searchd binary, overwrite your existing one and test with it.
Please be sure to overwrite only the Sphinx binaries as per the instructions below and do not upgrade the whole package, as it contains the kaltura-populate init script, which has also changed since your version.

You can do it like this:

# mkdir /tmp/new_sphinx
# cd /tmp/new_sphinx
# wget http://installrepo.origin.kaltura.org/repo/releases/13.15.0/6/RPMS/x86_64/kaltura-sphinx-2.2.1-23.x86_64.rpm
# rpm2cpio kaltura-sphinx-2.2.1-23.x86_64.rpm | cpio -idmv
# cp -r /opt/kaltura/sphinx/bin /opt/kaltura/sphinx/bin.orig
# service kaltura-sphinx stop
# cp ./opt/kaltura/sphinx/bin/* /opt/kaltura/sphinx/bin/
# service kaltura-sphinx start
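
The copy-and-back-up steps above can also be wrapped in two small helpers so that reverting is a single call. This is only a sketch under assumptions: `swap_binaries` and `rollback_binaries` are hypothetical names, the backup is placed next to the live directory with an `.orig` suffix as in the commands above, and the service still has to be stopped before and started after each call.

```shell
#!/bin/sh
# Sketch only: back up the live bin directory once, then copy new binaries over it.
# Usage: swap_binaries <new_bin_dir> <live_bin_dir>
swap_binaries() {
    new_dir=${1%/}
    live_dir=${2%/}
    backup_dir="${live_dir}.orig"
    [ -d "$new_dir" ] || { echo "missing directory: $new_dir" >&2; return 1; }
    # Keep the first backup; never overwrite it with already-replaced files.
    if [ ! -e "$backup_dir" ]; then
        cp -a "$live_dir" "$backup_dir" || return 1
    fi
    cp -a "$new_dir"/. "$live_dir"/
}

# Restore the original files from the .orig backup.
# Files that exist only in the new package are left in place.
# Usage: rollback_binaries <live_bin_dir>
rollback_binaries() {
    live_dir=${1%/}
    backup_dir="${live_dir}.orig"
    [ -d "$backup_dir" ] || { echo "no backup at $backup_dir" >&2; return 1; }
    cp -a "$backup_dir"/. "$live_dir"/
}

# Example (run as root, with kaltura-sphinx stopped):
#   swap_binaries /tmp/new_sphinx/opt/kaltura/sphinx/bin /opt/kaltura/sphinx/bin
#   rollback_binaries /opt/kaltura/sphinx/bin
```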

If, after upgrading the binaries as described above, you’re still having issues, you’ll need to obtain the debug version and load it as per my instructions here:

Then provide a stack trace and we’ll see what we can do :)

Hi Jess, thank you for your reply. In the meantime, I got a reply from the Sphinx team suggesting that I install a newer version, 2.2.11. Any suggestions will be greatly appreciated.
I appreciate your help and your time.
Thanks,
Dmitri

http://sphinxsearch.com/bugs/view.php?id=2676

2.2.1 is an aged version and a beta one to begin with…

Please try upgrading to, at the very least, 2.2.11, or 2.3.2, and ideally the latest 3.0.2, and see if a newer version works.
Simply replacing the indexer and searchd binaries should work for a test upgrade.
I’ll close the issue for now but feel free to reopen it if 3.0.2 is still crashing.

and

There should be compatibility at the data level. If possible, the best way is to clean up the old data completely and re-index it from scratch. Within one major version (and most likely even across a major version change), you can always just replace the old binaries with new ones.

Hi @astrava,

2.2.1 is indeed rather old but it’s the last one we tested with. I can’t promise you newer versions will work and considering the important role Sphinx plays in our platform, I wouldn’t recommend risking it. If it fails immediately and in an obvious manner, it’s easy to tell and revert but things may appear to be working well and fail quietly in some dark corner:)

I recommend you try taking the binaries from the latest kaltura-sphinx package as I explained in my last reply and, if it is still crashing, get a coredump so we can inspect the stack trace and gain a better understanding.
I also recommend that you rebuild all your indexes. See https://github.com/kaltura/platform-install-packages/blob/Mercury-13.16.0/RPM/scripts/postinst/kaltura-sphinx-reindex.sh. It may not work out of the box on your ENV but you can discern the required steps from reading it. The $APP_DIR/deployment/base/scripts/populateSphinx*.php scripts exist in your version as well.
Let me know if upgrading worked and if not, please post your BT.
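
As a rough sketch of the populate step only, the loop below runs every `populateSphinx*.php` script found under the application directory. It does not reproduce the other steps of the linked reindex script, which should be read directly. The function name `run_populate_scripts` and the `PHP_BIN` override are hypothetical, and the application directory is passed in explicitly because its location differs between installs.

```shell
#!/bin/sh
# Sketch only: run each populateSphinx*.php script in turn, stopping on the first failure.
# Usage: run_populate_scripts <app_dir>
run_populate_scripts() {
    scripts_dir="${1%/}/deployment/base/scripts"
    found=0
    for script in "$scripts_dir"/populateSphinx*.php; do
        [ -f "$script" ] || continue   # the glob matched nothing
        found=1
        echo "Running $script"
        "${PHP_BIN:-php}" "$script" || { echo "FAILED: $script" >&2; return 1; }
    done
    if [ "$found" -ne 1 ]; then
        echo "no populateSphinx*.php scripts under $scripts_dir" >&2
        return 1
    fi
}

# Example (the path is an assumption):
#   run_populate_scripts /opt/kaltura/app 2>&1 | tee /tmp/sphinx_reindex.log
```

Capturing the output in a log file makes it easier to post any warnings or errors back here.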

Thanks,

Hi Jess,

I have updated Sphinx and run the re-index script, which completed with some warnings. The PHP notices point to line 259 of DbManager.php:
$key = $dataSources[$curIndex];
Please advise.
I appreciate your help and your time.
Thanks,
Dmitri

./sphinx_update.sh
searchd (pid 28291 19984) is running…
Shutting down monit: [ OK ]
Starting monit: Starting monit daemon with http interface at [*:2812]
[ OK ]
Stopping searchd: Sphinx 2.2.1-id64-dev (r4097)
Copyright © 2001-2013, Andrew Aksyonoff
Copyright © 2008-2013, Sphinx Technologies Inc (http://sphinxsearch.com)

using config file ‘/opt/kaltura/app/configurations/sphinx/kaltura.conf’…
stop: successfully sent SIGTERM to pid 28291

Backing up files to /opt/kaltura/sphinx.bck.1521470705. Once the upgrade is done and tested, please remove this directory to save space
Starting searchd: Sphinx 2.2.1-id64-dev (r4097)
Copyright © 2001-2013, Andrew Aksyonoff
Copyright © 2008-2013, Sphinx Technologies Inc (http://sphinxsearch.com)

using config file ‘/opt/kaltura/app/configurations/sphinx/kaltura.conf’…
listening on all interfaces, port=9312
WARNING: index ‘kaltura_base’: no fields configured (use rt_field directive) - NOT SERVING
WARNING: index ‘kaltura_kuser_base’: no fields configured (use rt_field directive) - NOT SERVING
precaching index ‘kaltura_entry’
precaching index ‘kaltura_category’
precaching index ‘kaltura_kuser’
precaching index ‘kaltura_category_kuser’
precaching index ‘kaltura_cue_point’
precaching index ‘kaltura_entry_distribution’
precaching index ‘kaltura_caption_item’
precaching index ‘kaltura_tag’
precached 8 indexes in 0.037 sec

Shutting down monit: [ OK ]
Starting monit: Starting monit daemon with http interface at [*:2812]
[ OK ]
PHP Notice: Undefined offset: 0 in /client/kaltura/app/alpha/apps/kaltura/lib/db/DbManager.php on line 259
PHP Notice: Undefined offset: 0 in /client/kaltura/app/alpha/apps/kaltura/lib/db/DbManager.php on line 259


kaltura_sphinx_searchd.log


[Mon Mar 19 10:45:04.864 2018] [28291] rt: index kaltura_tag: ramchunk saved in 0.009 sec
[Mon Mar 19 10:45:04.864 2018] [28291] shutdown complete
[Mon Mar 19 10:45:04.866 2018] [19984] Child process 28291 has been finished, exit code 0. Watchdog finishes also. Good bye!
[Mon Mar 19 10:45:05.382 2018] [31389] Child process 31390 has been forked
[Mon Mar 19 10:45:05.382 2018] [31390] listening on all interfaces, port=9312
[Mon Mar 19 10:45:05.383 2018] [31390] WARNING: index ‘kaltura_base’: no fields configured (use rt_field directive) - NOT SERVING
[Mon Mar 19 10:45:05.383 2018] [31390] WARNING: index ‘kaltura_kuser_base’: no fields configured (use rt_field directive) - NOT SERVING
[Mon Mar 19 10:45:05.420 2018] [31390] accepting connections
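
To pull the affected index names out of a longer log, a small filter like the one below may help. It is a sketch that assumes the raw log file uses plain ASCII single quotes around index names (the curly quotes in this post are probably a forum rendering artifact); `not_serving_indexes` is a hypothetical name.

```shell
#!/bin/sh
# Sketch only: list the unique index names that searchd reported as NOT SERVING.
# Usage: not_serving_indexes <searchd-log>
not_serving_indexes() {
    grep 'NOT SERVING' "$1" \
        | sed -n "s/.*index '\([^']*\)'.*/\1/p" \
        | sort -u
}

# Example:
#   not_serving_indexes /opt/kaltura/log/sphinx/kaltura_sphinx_searchd.log
```

In the log above this would list kaltura_base and kaltura_kuser_base. The kaltura_base definition quoted at the top of the thread declares no rt_field, which is consistent with the warning; whether these two are template indexes that the serving indexes inherit from, and so harmless, is an assumption worth confirming against kaltura.conf.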