opsview-monit not running

Josef Rieder

Dear all,

I'm running Opsview Atom 5.2 on a virtualized Ubuntu 16.04 server (after upgrading from 5.0 and 5.1).

Unfortunately, I can't get opsview-monit up and running.

I get the following errors:

1) email messages:

PROBLEM: Opsview Watchdog Process is CRITICAL on host Hyp1Mgmt_opsview
Service: Opsview Watchdog Process
Host: Hyp1Mgmt_opsview
Alias: Opsview Master Server
Address: localhost
Host Group Hierarchy: Opsview > Monitoring Servers
State: CRITICAL
Date & Time: Sat Dec 24 05:38:10 CET 2016
Keywords: opsview-components

Additional Information:

PROCS CRITICAL: 0 processes with UID = 0 (root), command name 'opsview-monit'

2) summary output from opsview-monit:

myserver@myserver:~$ sudo /opt/opsview/watchdog/bin/opsview-monit summary
Monit: the monit daemon is not running

myserver@myserver:~$ sudo /opt/opsview/watchdog/bin/opsview-monit validate
'opsviewnfd' process is not running
'opsviewnfd' trying to restart
'opsviewnfd' start: /usr/bin/sudo
myserver@myserver:~$

myserver@myserver:/var/log/opsview$ sudo /opt/opsview/watchdog/bin/opsview-monit validate
'opsviewnfd' process is not running
'opsviewnfd' trying to restart
'opsviewnfd' start: /usr/bin/sudo
myserver@myserver:/var/log/opsview$

 

Running

sudo /opt/opsview/watchdog/bin/opsview-monit restart all

does not help either

 

I have already tried the following troubleshooting steps from the Opsview guide:

https://knowledge.opsview.com/articles/opsview-monitor-501/37-troublesho...

root@ov-author:~# /opt/opsview/watchdog/bin/opsview-monit summary
Status not available -- the monit daemon is not running

pkill -u opsview
pkill -u nagios
service opsview-watchdog start
/opt/opsview/watchdog/bin/opsview-monit start all

But still no success.

Has anybody encountered similar issues?

Is there any best practice for Ubuntu 16.04 (maybe also in connection with the systemd migration)?

Best regards, and thanks for any comments

 

Duncan Ferguson

What happens when you run '/etc/init.d/opsview-monit start' as root, or what errors did you get when you ran 'service opsview-watchdog start' as root?

This starts the opsview-monit daemon that the /opt/opsview/watchdog/bin/opsview-monit script interacts with.
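
Before trying to start the daemon, it can help to confirm that an init script or systemd unit for it exists at all. The following is only a minimal sketch: the function name find_service_entry is made up for illustration, and the default directories are the usual Ubuntu locations rather than anything Opsview-specific.

```shell
#!/bin/sh
# Report which start-up mechanism (if any) knows about a service.
#   $1 = service name
#   $2 = init.d directory (assumed default: /etc/init.d)
#   $3 = space-separated systemd unit directories (assumed defaults below)
# Returns 0 if at least one entry was found, 1 otherwise.
find_service_entry() {
    name=$1
    initd_dir=${2:-/etc/init.d}
    unit_dirs=${3:-"/etc/systemd/system /lib/systemd/system /usr/lib/systemd/system"}
    found=1

    # SysV-style init script
    if [ -x "$initd_dir/$name" ]; then
        echo "init script: $initd_dir/$name"
        found=0
    fi

    # Native systemd unit file
    for dir in $unit_dirs; do
        if [ -f "$dir/$name.service" ]; then
            echo "systemd unit: $dir/$name.service"
            found=0
        fi
    done

    [ "$found" -eq 0 ] || echo "no init script or unit file found for $name"
    return "$found"
}

# Example (commented out so that sourcing this file has no side effects):
# find_service_entry opsview-watchdog
```

If nothing is found, 'service opsview-watchdog start' has nothing to run, which points at a packaging problem rather than a daemon problem.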

  Duncs

Josef Rieder
opsview service not available any more

After a reboot, Opsview does not start at all.

/etc/init.d/opsview-monit does not exist,
and there is no systemd unit for any Opsview service either.

These files may have been lost during the upgrade to Ubuntu 16.04 (that release also switched to systemd, but this is only a guess).

How can I get the service back into systemd?

Details below:

myserver@myserver:~$ sudo /opt/opsview/watchdog/bin/opsview-monit summary
[sudo] password for myserver:
Monit: the monit daemon is not running
myserver@myserver:~$ sudo /opt/opsview/watchdog/bin/opsview-monit start all
'opsview-web' start: /usr/bin/sudo
'opsview-timeseriesrrdupdates' start: /usr/bin/sudo
'opsview-timeseriesrrdqueries' start: /usr/bin/sudo
'opsview-timeseriesenqueuer' start: /usr/bin/sudo
'opsview-timeseries' start: /usr/bin/sudo
'nagios' start: /usr/bin/sudo
'opsviewd' start: /usr/bin/sudo
'opsviewhd' start: /usr/bin/sudo
'opsview-agent' start: /usr/bin/sudo
'import_ndoconfigend' start: /usr/bin/sudo
'import_ndologsd' start: /usr/bin/sudo
'import_perfdatarrd' start: /usr/bin/sudo
'opsviewadmd' start: /usr/bin/sudo
'nsca' start: /usr/bin/sudo
'nrd' start: /usr/bin/sudo
'opsviewnfd' start: /usr/bin/sudo

myserver@myserver:~$ sudo /opt/opsview/watchdog/bin/opsview-monit summary
Monit: the monit daemon is not running

myserver@myserver:~$ sudo systemctl status opsview-watchdog
● opsview-watchdog.service
Loaded: not-found (Reason: No such file or directory)
Active: inactive (dead)

myserver@myserver:~$ cd /etc/init.d/
myserver@myserver:/etc/init.d$ ls
acpid gdomap networking rsyslog
apache2 geneweb ondemand screen-cleanup
apache-htcacheclean grub-common plymouth screen-cleanup.dpkg-new
apparmor halt plymouth-log sendsigs
apport hostname.sh postfix single
atd hwclock.sh postgresql skeleton
binfmt-support irqbalance pppd-dns snmpd
bootmisc.sh keyboard-setup procps sogo
cgmanager killprocs puppet ssh
cgproxy kmod puppetmaster tftpd-hpa
checkfs.sh lvm2 puppetqd thermald
checkroot-bootclean.sh lvm2-lvmetad qemu-kvm udev
checkroot.sh lvm2-lvmpolld qemu-kvm.dpkg-bak ufw
console-setup memcached rc umountfs
cron mountall-bootclean.sh rc.local umountnfs.sh
dbus mountall.sh rcS umountroot
dns-clean mountdevsubfs.sh README urandom
drbd mountkernfs.sh reboot uuidd
foreman mountnfs-bootclean.sh resolvconf webmin
foreman-proxy mountnfs.sh rpcbind whoopsie
ganeti mysql rsync x11-common

myserver@myserver:/etc/systemd$ sudo service opsview-watchdog status
● opsview-watchdog.service
Loaded: not-found (Reason: No such file or directory)
Active: inactive (dead)

Duncan Ferguson

This sounds like the upgrade went wrong.

Can you reinstall all opsview packages?  This will not lose any configuration as it is all stored within the MySQL database.

'aptitude reinstall opsview'

If 'service opsview-watchdog status' does not then work, reinstall all packages by name:

'aptitude reinstall opsview opsview-base opsview-core opsview-perl opsview-setup opsview-watchdog opsview-web'
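
To see which Opsview packages dpkg itself considers not fully installed, something like the sketch below may help. It assumes the input comes from dpkg-query with the format shown in the comment; the function name broken_packages is made up for illustration. Note that a package whose files have gone missing can still be reported as fully installed, so an empty result does not prove the package is healthy.

```shell
#!/bin/sh
# Read "<package> <status>" lines on stdin and print every package
# whose status is anything other than "install ok installed".
broken_packages() {
    while read -r pkg status; do
        # Skip empty lines
        [ -z "$pkg" ] && continue
        [ "$status" = "install ok installed" ] || printf '%s: %s\n' "$pkg" "$status"
    done
}

# Example (commented out; needs dpkg-query on the machine being checked):
# dpkg-query -W -f='${Package} ${Status}\n' 'opsview*' | broken_packages
```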

  Duncs

Josef Rieder
reinstall not successful

I then tried a reinstall with the following apt-get command (aptitude is not installed on my server):

sudo apt-get install --reinstall opsview

This did not really succeed; see below:

hyp1mgmt@hyp1mgmt:~$ sudo apt-get install --reinstall opsview
.........
Unpacking opsview (5.2.1.163061249-1xenial1) over (5.2.1.163061249-1xenial1) ...
Setting up opsview (5.2.1.163061249-1xenial1) ...
hyp1mgmt@hyp1mgmt:~$

myserver@myserver:~$ sudo service opsview-watchdog start
Failed to start opsview-watchdog.service: Unit opsview-watchdog.service not found.

I then reinstalled all packages by name, as suggested.
That did not succeed either. It seems that no connection to Upstart is possible.

See e.g. the following line, which repeats throughout the log (translated from the German console output):
initctl: Unable to connect to Upstart: Failed to connect to socket /com/ubuntu/upstart: Connection refused

The complete log cannot be included (the spam filter here is always triggered), so this is an excerpt:

myserver@myserver:~$ sudo apt-get install --reinstall opsview opsview-base opsview-core opsview-perl opsview-setup opsview-watchdog opsview-web
.......
Need to get 36.5 MB/36.5 MB of archives.
After this operation, 0 B of additional disk space will be used.
.......
Starting Monit 5.18 daemon with http interface at /opt/opsview/watchdog/run/monit.socket
.
Stopping opsview...
sudo: no tty present and no askpass program specified
done
initctl: Unable to connect to Upstart: Failed to connect to socket /com/ubuntu/upstart: Connection refused
insserv: warning: script 'screen-cleanup' missing LSB tags and overrides
insserv: Default-Start undefined, assuming empty start runlevel(s) for script `screen-cleanup'
insserv: Default-Stop undefined, assuming empty stop runlevel(s) for script `screen-cleanup'
initctl: Unable to connect to Upstart: Failed to connect to socket /com/ubuntu/upstart: Connection refused
insserv: warning: script 'screen-cleanup' missing LSB tags and overrides
insserv: Default-Start undefined, assuming empty start runlevel(s) for script `screen-cleanup'
insserv: Default-Stop undefined, assuming empty stop runlevel(s) for script `screen-cleanup'
Waiting for opsviewmd service to stop
Removing Nagios crontab entries
Unmonitoring opsview-core before stop

Stopping opsview...
nrd not running
nsca not running
opsviewd not running
nagios not running
import_ndologsd not running
import_perfdatarrd not running
import_ndoconfigend not running
opsviewadmd not running
opsviewhd not running
opsviewmd not running
done
initctl: Unable to connect to Upstart: Failed to connect to socket /com/ubuntu/upstart: Connection refused
insserv: warning: script 'screen-cleanup' missing LSB tags and overrides
insserv: Default-Start undefined, assuming empty start runlevel(s) for script `screen-cleanup'
insserv: Default-Stop undefined, assuming empty stop runlevel(s) for script `screen-cleanup'
initctl: Unable to connect to Upstart: Failed to connect to socket /com/ubuntu/upstart: Connection refused
insserv: warning: script 'screen-cleanup' missing LSB tags and overrides
insserv: Default-Start undefined, assuming empty start runlevel(s) for script `screen-cleanup'
insserv: Default-Stop undefined, assuming empty stop runlevel(s) for script `screen-cleanup'
Waiting for opsview-core services to stop
Unpacking opsview-core (5.2.1.163061249-1xenial1) over (5.2.1.163061249-1xenial1) ...
Preparing to unpack .../opsview_5.2.1.163061249-1xenial1_amd64.deb ...
Unpacking opsview (5.2.1.163061249-1xenial1) over (5.2.1.163061249-1xenial1) ...
Preparing to unpack .../opsview-watchdog_1.0.0.162461457-1xenial1_amd64.deb ...
Failed to stop opsview-watchdog.service: Unit opsview-watchdog.service not loaded.
dpkg: warning: subprocess old pre-removal script returned error exit status 5
dpkg: trying script from the new package instead ...
prerm called with unknown argument 'failed-upgrade'
dpkg: error processing archive /var/cache/apt/archives/opsview-watchdog_1.0.0.162461457-1xenial1_amd64.deb (--unpack):
subprocess new pre-removal script returned error exit status 1
Preparing to unpack .../opsview-web_5.2.1.163061249-1xenial1_amd64.deb ...
Unmonitoring opsview-web before remove

Stopping opsview-web: done
initctl: Unable to connect to Upstart: Failed to connect to socket /com/ubuntu/upstart: Connection refused
insserv: warning: script 'screen-cleanup' missing LSB tags and overrides
insserv: Default-Start undefined, assuming empty start runlevel(s) for script `screen-cleanup'
insserv: Default-Stop undefined, assuming empty stop runlevel(s) for script `screen-cleanup'
Unmonitoring opsview-web before upgrade

Not running
initctl: Unable to connect to Upstart: Failed to connect to socket /com/ubuntu/upstart: Connection refused
insserv: warning: script 'screen-cleanup' missing LSB tags and overrides
insserv: Default-Start undefined, assuming empty start runlevel(s) for script `screen-cleanup'
insserv: Default-Stop undefined, assuming empty stop runlevel(s) for script `screen-cleanup'
Unpacking opsview-web (5.2.1.163061249-1xenial1) over (5.2.1.163061249-1xenial1) ...
Errors were encountered while processing:
/var/cache/apt/archives/opsview-watchdog_1.0.0.162461457-1xenial1_amd64.deb
E: Sub-process /usr/bin/dpkg returned an error code (1)
myserver@myserver:~$

Duncan Ferguson

Problem has been resolved.

Because the opsview-watchdog package was in a strange state ('dpkg -L opsview-watchdog' showed that /etc/init.d/opsview-watchdog should be on the filesystem, but it wasn't), I had to remove and reinstall the package.
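
The same kind of check can be scripted. This is a minimal sketch, assuming the file list comes from 'dpkg -L'; the function name list_missing_files is made up for illustration.

```shell
#!/bin/sh
# Read one path per line on stdin and print every path that does not
# exist on disk. Dangling symlinks count as present, because the link
# itself was installed by the package.
list_missing_files() {
    while IFS= read -r path; do
        [ -e "$path" ] || [ -L "$path" ] || printf '%s\n' "$path"
    done
}

# Example (commented out; needs dpkg on the machine being checked):
# dpkg -L opsview-watchdog | list_missing_files
```

Any path printed is one that dpkg believes it installed but that is no longer on the filesystem.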

However, since the daemon could not be stopped cleanly by the preremove script, it could not be reinstalled.

I modified the /var/lib/dpkg/info/opsview-watchdog.prerm script to return 'true' in the install_or_upgrade method, then reran the reinstall command, and the problem was fixed.

Duncs

csterley

Any chance you can elaborate on the change you made?  We are having the exact same issue in our dev environment right now.

Duncan Ferguson

In essence, I edited /var/lib/dpkg/info/opsview-watchdog.prerm and put an 'exit 0' on the second line, so that the script did not try to perform any of its usual actions.

This allowed the normal removal process to work cleanly when running the dpkg/apt-get/aptitude commands, which then allowed the watchdog package to be installed and complete the upgrade process.
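
The edit described above could be scripted roughly as follows. This is only a sketch: the function name neutralise_script is made up for illustration, the path to the prerm script is passed in as an argument, and a backup copy is kept next to the original so the change can be undone.

```shell
#!/bin/sh
# Insert 'exit 0' as line 2 of a maintainer script so that it exits
# successfully without doing anything. A backup is written to <script>.bak.
#   $1 = path to the script (for example the package's .prerm file)
neutralise_script() {
    script=$1
    [ -f "$script" ] || { echo "not found: $script" >&2; return 1; }

    # Keep an untouched copy (with original permissions) for later
    cp -p "$script" "$script.bak" || return 1

    # Line 1 (normally the shebang), then 'exit 0', then the rest.
    # Redirecting into the existing file keeps its mode bits.
    {
        sed -n '1p' "$script.bak"
        echo 'exit 0'
        sed '1d' "$script.bak"
    } > "$script"
}

# Example (commented out; must be run as root on the affected machine):
# neutralise_script "$PRERM"    # PRERM = path to opsview-watchdog.prerm
# apt-get install --reinstall opsview-watchdog
```

dpkg normally keeps maintainer scripts under /var/lib/dpkg/info/, and the reinstall replaces the edited prerm with the one shipped in the package, so the change should not persist.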

  Duncs

csterley

That did the trick.

Thanks