Perform Lucene Index schedule causing 100% CPU

Over the last two nights we have seen the “Perform Lucene Index” scheduled job spawn a PHP process that consumes all available CPU until it is killed. Has anyone else suffered from this at all? Any suggestions? For now, we have disabled the schedule.

Thanks, Richard

SuiteCRM Version 7.11.2
Running on Ubuntu 18.04.3 LTS
PHP version 7.2.19
MySQL version 14.15 Distrib 5.7.26

This will give you some more clues about what could be going on:

https://pgorod.github.io/Reindex-AOD/

I am not telling you to reindex (although that might be a solution) but have a look at the files, try to spot permissions problems, overgrown stuff, etc.

Also check both your logs.

Hi, thanks for the suggestions. I’d already checked the logs and there’s nothing odd in either of them.
I’ve cleared down the files and MySQL data and I’ll see how it runs tonight.

Ok. When you come back tomorrow with your report, please include some extra info: how big your database is (a general idea) and how big your index is on disk.

Also check your PHP resource parameters, the ones for CLI (not the ones for the web server).

So find the correct php.ini with

php -i | grep php.ini

and check things like

memory_limit
max_execution_time
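A quick way to double-check those values is to ask the CLI binary what it actually uses at runtime, rather than reading php.ini by hand. This is only a sketch: the function name show_cli_limits is made up for illustration. One thing to keep in mind is that the PHP manual documents the CLI as overriding max_execution_time to 0 (unlimited), so the value printed here may not match what is written in the file.

```shell
# Hypothetical helper: print the php.ini loaded by the CLI binary and the
# effective runtime values of the two settings mentioned above.
show_cli_limits() {
    if ! command -v php >/dev/null 2>&1; then
        echo "php not found on PATH" >&2
        return 1
    fi
    # Which configuration file the CLI really loads
    php --ini | grep "Loaded Configuration File"
    # The values PHP reports at runtime, which may differ from the file
    php -r 'foreach (["memory_limit", "max_execution_time"] as $k) { echo $k, " = ", ini_get($k), PHP_EOL; }'
}
```

Run it as the same user that cron uses for the scheduler (often www-data on Ubuntu, but that is an assumption about your setup), since that is the environment the indexing job sees.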


It did happen again last night and I had to kill the php process.
In answer to your questions…

/etc/php/7.2/cli/php.ini
memory_limit = -1
max_execution_time = 30

du -hs /var/lib/mysql/suitecrm - 720 MB
du -hs /var/www/html/modules/AOD_Index/Index/Index - 20M

This is a Live Production system. The php config has not changed. The only change that matches the time when this problem started has been the deployment of two new custom modules. These have been running on a separate Development system with the same data for a few weeks without a problem.

I would increase max_execution_time to 60 or 90. Then restart web server.

The thing that’s annoying me is not having a PHP error in the logs. This would really focus our diagnosis.

Do you have a /var/log/apache2/error.log in your system? Any clues there?

Sometimes Lucene indexes get corrupted. Maybe a full reindex could solve your problems.

Sorry, but I don’t see the logic in changing the max_execution_time; it’s not failing after 60 secs, it just keeps running.
Also, restarting the web server won’t affect a PHP task that is running from cron.
There are no PHP errors because it’s not erroring, it just keeps running.
There wouldn’t be entries in the Apache log for a php task run from cron anyway.
As I mentioned before, I have already deleted the Index to force a re-index.

The question is, what would make it just keep running and never end?

I was just recommending that as a general server health option, and the restart was just to make the option effective. But you might be right; maybe this is not necessary.

I guess I am also looking for other processes that could be breaking, not just the indexing itself, that could serve as clues. For example, you could have errors in your apache log when adding a record to a certain table with performance problems, since that will call into the indexing code also.

As I mentioned before, I have already deleted the Index to force a re-index.

Sorry, I missed that, I am on waaaaay too many threads here :blink:

Another suggestion I could give you is to run the first query on this post:

https://pgorod.github.io/Database-tables-size/

to get a sense of whether you have any overgrown tables. These might be generating abnormally slow queries. There is an option in SuiteCRM to “log slow queries” which would tell you exactly which ones, but normally just a glance at the tables will intuitively show you if any problems exist. Note that indexing also uses a few database tables, not just the file system.

I’ve cleared down some data, although there is not much there to start with. We only have 4 users on this system so far.
I’ve re-organised the MySQL database and cleared the Index.
I’ll see how it goes tonight.

Happened again last night.
It does look like this has been provoked by the installation of our new Custom Module.
It’s not the first Custom Module we have installed on the Live System and it was constructed in Module Builder.
So there’s no additional PHP code in there.

Have you tried “log slow queries”?

Can you use “top” or equivalent process manager program to try and figure out if the CPU is being consumed by the database or by PHP, or by something else?
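If you prefer a one-shot snapshot over an interactive “top” session, something like the following lists the heaviest CPU consumers along with how long each has been running. It is a sketch that assumes the procps version of ps that ships with Ubuntu; the function name top_cpu is just for illustration.

```shell
# Hypothetical helper: header line plus the five processes using the most CPU,
# with elapsed run time (etime) and the full command line (args).
top_cpu() {
    ps -eo pid,user,pcpu,pmem,etime,args --sort=-pcpu | head -n 6
}

top_cpu
```

A php process near the top with a long elapsed time points at PHP itself, while mysqld at the top would suggest a slow query instead.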

From my original post - Over the last two nights we have seen the “Perform Lucene Index” Scheduled work spawn a PHP process that consumes all available CPU until killed.

Did you create any relationships between a module and itself?

I am also assuming you didn’t add any logic hooks, because of something you wrote up there (and amazingly I read it!)

Relationships to self? No, that would be insane :slight_smile:
No logic hooks, as I said “So there’s no additional PHP code in there.”

Actually I’ve often used a Contact-to-Contact relationship and it works very well. I was just wondering if there could be some edge-case where this would cause recursion, but normally it doesn’t.

I am running out of ideas about what could be happening… :huh:

If you can reproduce the problem on demand, you can always try debugging through the code with an IDE, if you have such a setup (and skill). Or maybe even just a DEBUG-level log, watched live with “tail -f”, could be enough to give you clues about which parts of the code are looping endlessly, if any.
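When a DEBUG log scrolls too fast to read live, counting repeated messages can make a loop stand out. The sketch below is not tied to SuiteCRM's exact log format: it assumes each line starts with a bracketed prefix (timestamp, PID, level) ending in "] ", and the function name top_repeats is made up for illustration.

```shell
# Hypothetical helper: print the ten most frequent message bodies among the
# last N lines of a log file (default 5000).
# Everything up to the last "] " on each line is dropped, so differing
# timestamps and PIDs do not make identical messages look different.
top_repeats() {
    log_file="$1"
    lines="${2:-5000}"
    tail -n "$lines" "$log_file" \
        | sed 's/^.*\] //' \
        | sort | uniq -c | sort -rn | head -n 10
}

# Example call (the path and file name are assumptions about your install):
#   top_repeats /var/www/html/suitecrm.log 20000
```

If one query or message shows up thousands of times while the job is spinning, that is a good hint about which part of the code is looping.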

I’m doing this work as part of a SuiteCRM setup for a client. They would rather the system performed well than I spend more time on this. They just won’t use the Global Search.

Hello.
I am having problems with a custom module as well.
It was working perfectly before 7.11.6 (or 5).

The CPU rises to 80% for apache and 20% for mysql, and it generates a timeout after 60 seconds when I make a simple call like this:

$mySupplierOrder = new SupplierOrders();
$mySupplierOrder->retrieve($id);

Nothing in SuiteCRM logs.
Only a timeout in Apache errors log.

It is hard for my team to work on this now.
Is it related to Lucene?

I’ve just had to turn off the “Perform Lucene Index” schedule.

Sorry. Problem not related to Lucene. Thx