Analytics not getting results

Hello Jess!!!

Finally I got everything working correctly!!! Thanks for all your help… I am going to explain in a few lines what the problems were and how I solved them. I am doing this for other people who may run into the same problem. I think these notes (with the appropriate corrections on your side) should be included in this page https://github.com/kaltura/platform-install-packages/blob/Lynx-12.11.0/doc/rpm-cluster-deployment-instructions.md or in this other page https://github.com/kaltura/platform-install-packages/blob/Lynx-12.11.0/doc/kaltura-packages-faq.md#analytics-issues.

ANALYTICS DOES NOT SHOW DATA (RPM CLUSTER KALTURA)!!!

Environment:

7 nodes (one for each service); all nodes are CentOS 6.5 deployed on EC2 instances in AWS

  • FRONT
  • BATCH
  • MYSQL
  • INDEX
  • DWH
  • VOD
  • STREAMING

FLOW OF ANALYTICS (what /opt/kaltura/bin/kaltura-run-dwh.sh does, in a few lines)
1.- Open a standalone page for the entry from the KMC (if you do not, the stats will not increase) and hit the play button
2.- The FRONT node collects this info in the kaltura_apache_access.log
3.- Do the logrotate ON THE FRONT NODE; this action creates a new .gz file of the access log under /opt/kaltura/web/logs
4.- That file is the one the DWH node takes in order to create the stats
5.- Run the /opt/kaltura/bin/kaltura-run-dwh.sh script on the DWH node. It takes the info from the file and loads the tables in the kalturadw database
6.- Check the KMC console to see the plays and/or views increasing
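
A quick way to watch this flow with your own eyes, one node at a time. This is a minimal sketch using the paths from this thread; the grep pattern is only my assumption about how a player stats call shows up in the access log, so adjust it to whatever you actually see in your own log lines:

    # FRONT node: confirm the play hit landed in the Apache access log
    # ("service=stats" is an assumed pattern; check your own log format)
    tail -n 200 /opt/kaltura/log/kaltura_apache_access.log | grep -i 'service=stats'

    # FRONT node: after the logrotate, the rotated .gz must appear on the shared path
    ls -ltr /opt/kaltura/web/logs/

    # DWH node: the same .gz must be visible through NFS, then run the cycle
    ls -ltr /opt/kaltura/web/logs/
    /opt/kaltura/bin/kaltura-run-dwh.sh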

HOW TO SIMULATE AN ANALYTICS CYCLE
1.- Open /opt/kaltura/bin/kaltura-run-dwh.sh on the DWH node and comment out the line about the logrotate (this is because in an RPM cluster the logrotate must be done on the FRONT NODE)
2.- Edit /etc/logrotate.d/kaltura_apache on the FRONT NODE, changing the mv line to “mv /opt/kaltura/log/kaltura_apache_access.log-`/bin/date +%Y%m%d`.gz /opt/kaltura/web/logs/`hostname`-kaltura_apache_access.log-`/bin/date +%Y%m%d-%H%M%S`.gz” (I have added %H%M%S at the end of the file name). This is so that different .gz files are created within the same hour and the DWH will process them… If you do not change this, you will not be able to run more than one test per hour, because the DWH will think it is the same file over and over. (See the example stanza after this list.)
3.- Open a standalone page for the entry from the KMC (if you do not, the stats will not increase) and hit the play button
4.- Run logrotate -vvv -f /etc/logrotate.d/kaltura_apache on the FRONT NODE; this will create the .gz file and will empty the current kaltura_apache_access.log (keep this in mind…). A problem can come up if you run the dwh script again hoping to collect the data, but the data is no longer in the log file because the last logrotate removed it.
5.- Check that the .gz file was created under /opt/kaltura/web/logs on the FRONT NODE
6.- Check that the .gz file is visible under /opt/kaltura/web/logs on the DWH NODE
7.- Run /opt/kaltura/bin/kaltura-run-dwh.sh on the DWH NODE
8.- Check the KMC; the plays and/or views must have increased
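
For step 2, the edited stanza in /etc/logrotate.d/kaltura_apache ends up looking roughly like this. Only the mv line is the actual change described above; the surrounding directives are placeholders for whatever your stock file already contains, so keep those as they are:

    /opt/kaltura/log/kaltura_apache_access.log {
        daily
        compress
        dateext
        # ... keep the rest of your stock directives here ...
        lastaction
            # the change: append -%H%M%S so every rotation produces a uniquely
            # named .gz and the DWH never mistakes it for an already-seen file
            mv /opt/kaltura/log/kaltura_apache_access.log-`/bin/date +%Y%m%d`.gz \
                /opt/kaltura/web/logs/`hostname`-kaltura_apache_access.log-`/bin/date +%Y%m%d-%H%M%S`.gz
        endscript
    }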

POINTS TO CHECK
1.- When you install an RPM cluster you must have an NFS service so that files are shared across all nodes (except the INDEX node) under the path /opt/kaltura/web. When you configure the DWH node, check that a folder exists in the shared path at /opt/kaltura/web/logs with permissions 755 and owner root:root (see the commands after this list).
2.- Check that the timezones on all the nodes of the environment are the same, in both system time and PHP time. Run “php -r 'echo date("r")."\n";'” and “date” from a shell console and check that they give the same output. If they don't, the MySQL server will not process the data until its own time is later than the last event. Check too that the MySQL daemon has the same TZ: run “select now()” on the MySQL server.
3.- After running the dwh script, check these tables in the kalturadw database on the MYSQL node (a ready-made query block follows this list):
a.- select * from kalturadw_ds.files; -> check that the .gz file created on the NFS is the last record in the table, and write down the “file_id” of that record. (If the last file is not there, it could be that you have not edited the logrotate.d/kaltura_apache file, so the server thinks the file you are trying to process has already been processed and does nothing.)
b.- select * from kalturadw_ds.parameters; -> check dim_sync_last_update and that its date_value matches the time you ran the test
c.- select * from kalturadw.dwh_fact_events; -> check that the events related to the entry you played are recorded in this table
d.- select * from kalturadw.dwh_entry_plays_views; -> shows whether all the data has been loaded into plays and views
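
Points 1 and 2 boil down to a few commands you can run on every node. A sketch, with the MySQL host and user as placeholders:

    # point 1: the shared folder must exist on the NFS mount with these perms
    ls -ld /opt/kaltura/web/logs     # expect: drwxr-xr-x ... root root

    # point 2: system time, PHP time and MySQL time must all agree
    date
    php -r 'echo date("r")."\n";'
    mysql -h MYSQL_HOST -u USER -p -e 'SELECT NOW();'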
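And the four checks from point 3 in one go. Host and user are placeholders again; the ORDER BY and LIMIT clauses are additions of mine so the newest rows come up first on a busy system:

    mysql -h MYSQL_HOST -u USER -p <<'SQL'
    -- (a) the .gz you rotated should be the newest row; note its file_id
    SELECT * FROM kalturadw_ds.files ORDER BY file_id DESC LIMIT 5;
    -- (b) dim_sync_last_update: its date_value should be close to your test time
    SELECT * FROM kalturadw_ds.parameters;
    -- (c) the play events for your entry should appear here
    -- (add an ORDER BY on the date column if this table is large)
    SELECT * FROM kalturadw.dwh_fact_events LIMIT 20;
    -- (d) the aggregated plays/views
    SELECT * FROM kalturadw.dwh_entry_plays_views LIMIT 20;
    SQL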


Hi Juan,

You’re very welcome and I’m glad to hear we’re good :)

Hi Jess,

Unfortunately, I have to reopen this issue. When I applied all the info I had collected to my production environment… it crashed a lot.

The problem now is that the cycle does not seem to start… When I look at the logs under /opt/kaltura/dwh/logs on the DWH node, the files exist but they are mostly empty… The events table is empty, and so is the plays/views one… Only the select * from kalturadw_ds.files seems to work… The last file shown there is the one I created with the log rotation, but the cycle stops there.

I have checked everything you have told me over the last 2 weeks. The timezone is the same on all nodes (UTC), and the “logs” folder is created. I think it is a sync problem. Maybe the system believes that the last cycle has no data… I tried to repeat it but nothing happens and the logs are still empty.

In my development environment it works, but not in production… Same machines, same distro.

Is there a way to debug the whole cycle?… I mean, is it important to run all the scripts under the run-dwh.sh file in the same order? What happens if I run only one script and forget the others, do I lose the sync in that case?

Note: In my prod environment the MySQL server is an RDS instance deployed in AWS… I should not have any problems, should I?

Thanks