DWH analytics problem

Hello,

As expected, after many happy months, my DWH analytics stopped working and my plays don’t get updated anymore.

The problem is that I don’t see any errors at all in /opt/kaltura/dwh/logs, and I don’t have any locks.

I already ran call kalturadw.add_partitions(); but it didn’t help.

I would like to reprocess the apache logs, but I don’t know which temporary kalturadw tables I should empty before reprocessing.

Thank you very much for any help,

David Eusse

Hello,

I let the processes run by themselves and this is what I’m getting now:

ERROR 10-03 23:00:06,614 - Update Dimensions - A serious error occurred during job execution: org.pentaho.di.core.exception.KettleJobException:
Unable to begin processing by logging start in logtable etl_log

Unable to write log record to log table etl_log

Error inserting/updating row.

I can’t find the log table, so any help would be appreciated,

David Eusse

Hello, @david.eusse

Normally, the “etl_log” table is in the “kalturalog” database.

Regards

Thank you!
Just found it. I repaired it, and now I’m waiting for the next cron run.

Regards,

David

Hello @t-saito,

Now I’m getting this error:

Unable to begin processing by logging start in logtable etl_log

Unable to write log record to log table etl_log

Error inserting/updating row
Deadlock found when trying to get lock; try restarting transaction

How do I restart the transaction?
Regards,

David

Hello @t-saito,

Is it possible to empty this table so it starts again?

Thank you for your help,

David

Hello @david.eusse

Please see the “locks” table in the “kalturadw_ds” database.
When no cron is running, the “lock_state” field of each record should be set to 0.
If “lock_state” is left at 1 while no ETL process is actually working, the ETL processes will not run correctly.
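For reference, a quick way to inspect and reset those flags (a sketch only; please verify the column names against your own schema, and run the UPDATE only when no ETL job is active):

```sql
-- Inspect the ETL lock flags.
SELECT * FROM kalturadw_ds.locks;

-- Reset stuck locks (only when no ETL process is running!).
UPDATE kalturadw_ds.locks SET lock_state = 0;
```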

Regards

Thank you @t-saito, but I checked and there are no locks. It’s something else.

I’ll try to empty the etl_log table and see how it works.

Regards,

David

Hello @david.eusse

The error message “Deadlock found when trying to get lock” is reported by the database system (MySQL / MariaDB).
If you face the same error again, please connect to the database console and execute the following command:

mysql> SHOW ENGINE INNODB STATUS;

Then look at the “LATEST DETECTED DEADLOCK” block in the results.
If that block exists, it will show the SQL statement that caused the deadlock.

Regards

Hello @t-saito

I ran mysql> SHOW ENGINE INNODB STATUS; but the output did not contain any “LATEST DETECTED DEADLOCK” block.

I got this again this morning:

ERROR 11-03 16:00:06,918 - Update Dimensions - A serious error occurred during job execution: org.pentaho.di.core.exception.KettleJobException:
Unable to begin processing by logging start in logtable etl_log

Unable to write log record to log table etl_log

Error inserting/updating row
Deadlock found when trying to get lock; try restarting transaction

I ran mysqlcheck and it’s fine.

My questions are:
Is there a way to empty some temp tables so I can reprocess the log files again?
Can I empty the etl_log table?

Thank you in advance for your help,

David Eusse

Hello @david.eusse

Normally, temporary tables are deleted when the session that created them is closed.
Because of that, users cannot delete those temporary tables from another session.

I think you can empty the etl_log table as long as no process is using it.
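If you do clear it, a minimal sketch (assuming the table lives in the kalturalog database, as mentioned earlier in this thread, and that no ETL job is currently running):

```sql
-- Run only while no ETL process is active.
TRUNCATE TABLE kalturalog.etl_log;
```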

Regards

Hello @t-saito,
Hope you are doing well during these difficult times.
After a lot of trouble, I reinstalled kaltura-dwh. I don’t get any errors anymore, but I still don’t get any plays or views.

After debugging all the steps, I noticed that the final triggers don’t do anything. When /opt/kaltura/app/alpha/scripts/dwh/dwh_plays_views_sync.sh runs, it connects to the database and exits.

I checked and all kalturadw tables are filled with data.

Any ideas ?

Thank you again for your help,

David

Hello @david.eusse,

In the KMCng, are other statistics (for example, bandwidth usage) updated according to users’ actions?
Or are no statistic values updated at all?
If the other statistics are updated correctly, we faced a similar problem in an earlier version of Kaltura CE.

Hello @t-saito,

After a lot of work and a clean dwh install, it started showing data again. The dwh_entry_plays_views and dwh_hourly_events_entry tables were not being filled, and I don’t know why.

Now I have a question, if you don’t mind.

If I put the apache log files back into /opt/kaltura/web/logs, it will recalculate the stats, won’t it?
I will do that little by little so it doesn’t disturb the cycles.

I should clarify that, in order to avoid a server lock, I moved all the apache logs into another folder before performing the clean dwh install.

regards,

David

Hello @david.eusse,

If you put the apache log files in /opt/kaltura/web/logs, it will recalculate the stats.

Note that the kalturadw_ds.files table holds the list of log files that have already been processed in ETL cycles.
Before placing the log files back in the /opt/kaltura/web/logs folder, you must delete the records for those files from the kalturadw_ds.files table.
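As a sketch, assuming the file names follow the pattern shown elsewhere in this thread and that the table has a file_name column (please check your actual schema first):

```sql
-- See which of the archived files are already registered as processed.
SELECT * FROM kalturadw_ds.files
WHERE file_name LIKE '%kaltura_apache_access.log-20200327%';

-- Delete those records so the ETL will pick the files up again.
DELETE FROM kalturadw_ds.files
WHERE file_name LIKE '%kaltura_apache_access.log-20200327%';
```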

Regards

Hello @t-saito

Thank you. That’s what I’ll do.

My new setup worked (more or less). It processed the files, then got stuck updating only two entries. :frowning:

Unfortunately, no errors appear on my log files.

Any ideas ?

Regards,

David

Hello @david.eusse,

The log file may be too large to process.
If the size of the log file is the issue, splitting the old log file before placing it in the /opt/kaltura/web/logs folder should work.

For example, hostname-kaltura_apache_access.log-20200327-01.gz, hostname-kaltura_apache_access.log-20200327-02.gz, hostname-kaltura_apache_access.log-20200327-03.gz, and so on.

First, extract the old log file.
Next, split the log file.
Then, compress each split log file.
Finally, place the compressed split log files in the /opt/kaltura/web/logs folder.
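The steps above can be sketched as a shell session (the file names are just examples matching the pattern above, and the generated sample log stands in for your real rotated log):

```shell
#!/bin/sh
set -e

# Work in a scratch directory; replace this with your real log location.
workdir=$(mktemp -d)
cd "$workdir"

# Stand-in for a large rotated Apache access log (use your real .gz file).
seq 1 1000 | sed 's/^/127.0.0.1 - - request /' > access.log-20200327
gzip access.log-20200327

# 1. Extract the old log file (keep the original .gz).
gunzip -k access.log-20200327.gz

# 2. Split it into smaller pieces (300 lines each here; tune as needed).
split -l 300 -d access.log-20200327 access.log-20200327-

# 3. Compress each split log file.
gzip access.log-20200327-*

# 4. Move the compressed pieces into /opt/kaltura/web/logs (not done here).
ls access.log-20200327-??.gz
```

With 1000 input lines and 300-line chunks, this produces four compressed pieces that can then be placed, little by little, in /opt/kaltura/web/logs.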

If splitting the old log files works out, please also try modifying the log rotation script and cron so that new log files are processed the same way.

Regards

Hello @t-saito,

Thank you. I’ll try and let you know.

Regards,

David

Hello @t-saito,

I finally managed to load an old backup made with xtrabackup, and it started recording again!
I deleted all the records from the files table, but the days that didn’t process well still don’t show up after putting the archived log files in /opt/kaltura/web/logs and reprocessing without errors.

Is there another table where these processed records are kept ?

Thank you for your help,

David

Hello @david.eusse,

Sorry, I don’t know anything more.
I think there is no other table.
The old log files may have been added to the statistics on the day they were processed by the ETL.

Regards