Revision 24 as of 2007-06-16 14:13:41 (comment: brought up to date); previous revision 23 as of 2007-01-25 18:25:27 (comment: update todo)
Mississippi Network Monitoring
chevy runs Cacti and monitors all of the deployed metrixes and ciscos. The graphs are available at https://chevy.personaltelco.net/cacti/
Use username: guest, password: freewifirocks.
Set up
Metrix
SNMP
Nowadays the image we use for metrixes, which is based on PyramidLinux and maintained by RussellSenior, ships with snmpd and the following ten "exec"-style custom exports:
|| What? || OID || snmpd.conf line ||
|| Local Coverage (ath0) association count || 1.3.6.1.4.1.2021.8.1.101.1 || exec assoc_count /usr/local/bin/assoc_count ||
|| Upstream Link Loss || 1.3.6.1.4.1.2021.8.1.101.2 || exec link-loss /usr/local/bin/get-value.sh backhaul loss ||
|| Upstream Link Ping Trials || 1.3.6.1.4.1.2021.8.1.101.3 || exec link-trials /usr/local/bin/get-value.sh backhaul ping-trials ||
|| Upstream Link Ping Successes || 1.3.6.1.4.1.2021.8.1.101.4 || exec link-success /usr/local/bin/get-value.sh backhaul ping-success ||
|| Upstream Link Latency Min || 1.3.6.1.4.1.2021.8.1.101.5 || exec link-latency-min /usr/local/bin/get-value.sh backhaul latency-min ||
|| Upstream Link Latency Ave || 1.3.6.1.4.1.2021.8.1.101.6 || exec link-latency-ave /usr/local/bin/get-value.sh backhaul latency-ave ||
|| Upstream Link Latency Max || 1.3.6.1.4.1.2021.8.1.101.7 || exec link-latency-max /usr/local/bin/get-value.sh backhaul latency-max ||
|| Upstream Link RSSI Min || 1.3.6.1.4.1.2021.8.1.101.8 || exec link-rssi-min /usr/local/bin/get-value.sh backhaul rssi-min ||
|| Upstream Link RSSI Ave || 1.3.6.1.4.1.2021.8.1.101.9 || exec link-rssi-ave /usr/local/bin/get-value.sh backhaul rssi-ave ||
|| Upstream Link RSSI Max || 1.3.6.1.4.1.2021.8.1.101.10 || exec link-rssi-max /usr/local/bin/get-value.sh backhaul rssi-max ||
Before you point out that "extend" would be better than "exec": we are running net-snmp 5.1.2, which predates the "extend" directive. For more information on these exec scripts, see the Scripts section at the bottom of this page.
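The OIDs in the table follow a fixed pattern: the Nth exec line in snmpd.conf is exported at 1.3.6.1.4.1.2021.8.1.101.N. A minimal sketch of that mapping, for use in ad-hoc checks; the function name exec_oid and the variable EXEC_OUTPUT_BASE are hypothetical and not part of the metrix image.

```shell
#!/bin/sh
# Hypothetical helper (not on the metrix image): print the OID under which
# the Nth "exec" line of snmpd.conf exports its output.
EXEC_OUTPUT_BASE=1.3.6.1.4.1.2021.8.1.101

exec_oid () {
    echo "${EXEC_OUTPUT_BASE}.$1"
}
```

For example, exec_oid 2 gives the OID for Upstream Link Loss, the second exec line in the table.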
Cacti
Then finish up in Cacti:
- Add the device to Cacti with the "PTP MGP Metrix" template.
- Create graphs: use all of the templated ones and, for interface stats, ath0...athN and eth0, which are the most useful.
- Add the device to the main graph tree (under MGP/Rooftop Metrixes).
- Add the assoc_count_exec data source to the "Combined Associations" graph, following the existing entries as an example.
WGTs
TODO: Fill me in with correct information!
Scripts
assoc_count.sh
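{{{
#!/bin/sh
# Print the association count on ath0: the line count of the madwifi
# proc file divided by 3.
echo $(($(wc -l < /proc/net/madwifi/ath0/associated_sta)/3))
}}}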
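The division by 3 suggests that each associated station takes up three lines in the proc file; that is an inference from the script, not something this page states. A sketch of the same calculation with the file passed as an argument, so it can be tried against a saved copy; the function name count_stations is hypothetical.

```shell
#!/bin/sh
# Hypothetical, testable variant of assoc_count.
# ASSUMPTION: every associated station occupies exactly three lines.
count_stations () {
    file=${1:-/proc/net/madwifi/ath0/associated_sta}
    echo $(( $(wc -l < "$file") / 3 ))
}
```

Called without an argument it reads the same proc file as assoc_count.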
get-value.sh
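{{{
#!/bin/sh
# Print one cached link statistic. The cache is recomputed when the file
# is missing or at least 60 seconds old.
LINK=${1:-backhaul}
VALUE=${2:-loss}
DIR=/tmp/linkstats
TARGET=${DIR}/${LINK}-${VALUE}

if [ ! -f "${TARGET}" ] || [ $(expr $(date +%s) "-" $(date -r "${TARGET}" +%s)) -ge 60 ]; then
    /usr/local/bin/compute-stats.sh "${LINK}"
fi

cat "${TARGET}"
}}}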
monitor-link.sh
{{{
#!/bin/sh
# grab information for link monitoring

DESTIP=10.11.104.2
DESTNAME=backhaul
IFACE=ath3        # NOTE: not referenced below; the RSSI is read from ath2
INTERVAL=500      # in centiseconds
OUTDIR=/tmp/linkstats

# Current uptime in centiseconds.
centiseconds () {
    awk '{ printf("%ld\n", $1 * 100) }' /proc/uptime
}

mkdir -p -m 777 ${OUTDIR}

start=$(centiseconds)
end=$start

while true; do
    end=$(expr ${end} "+" ${INTERVAL})
    latency=$(ping -c 1 -i 5 -w 4 -q ${DESTIP} | sed -n -r -e 's|^rtt min/avg/max/mdev = ([0-9.]*)/.*|\1|p')
    # sed prints nothing when the ping fails, so test for an empty string
    if [ -n "${latency}" ]; then
        rssi=$(awk 'NR == 1 { level = $1 } NR == 2 { noise = $1 } END { print level - noise }' /sys/class/net/ath2/wireless/level /sys/class/net/ath2/wireless/noise)
    else
        latency=""
        rssi=0
    fi
    now=$(centiseconds)
    # success: "time latency rssi" (3 fields); failure: "time 0" (2 fields)
    echo $now $latency $rssi >> ${OUTDIR}/${DESTNAME}
    sleep $(expr $(expr ${end} "-" ${now}) "/" 100)
done
}}}
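The latency capture in monitor-link.sh depends on the summary line that ping prints after a successful run. A quick way to check the sed expression against sample text without touching the network; the function name parse_latency and the sample lines in the usage note are illustrative assumptions, and -E is used here as the equivalent spelling of the -r flag in the script.

```shell
#!/bin/sh
# Extract the minimum round-trip time from ping's "rtt min/avg/max/mdev"
# summary line on stdin. Prints nothing if that line is absent, which is
# what happens when every packet is lost.
parse_latency () {
    sed -n -E -e 's|^rtt min/avg/max/mdev = ([0-9.]*)/.*|\1|p'
}
```

Feeding it a line such as "rtt min/avg/max/mdev = 1.234/1.234/1.234/0.000 ms" prints 1.234, while a packet-loss summary on its own prints nothing.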
link-stats.sh
{{{
#!/bin/sh
# Summarise the samples logged by monitor-link.sh. Prints nine values:
# loss successes trials latency-min latency-ave latency-max rssi-min rssi-ave rssi-max
INPUT=/tmp/linkstats/backhaul

if [ ! -f ${INPUT} ]; then
    echo 0 0 0 0 0 0 0 0 0
    exit 0
fi

/bin/mv ${INPUT} ${INPUT}-computing

/bin/awk 'BEGIN { min_latency = 5.0 ; max_latency = 0.0; min_rssi = 100 ; max_rssi = 0 }
NF == 2 { n_trials++ ; next }
NF == 3 { latency = $2 ; rssi = $3 ; sum_latency += latency ; sum_rssi += rssi ; n_trials++ ; n_success++ }
latency < min_latency { min_latency = latency }
latency > max_latency { max_latency = latency }
rssi < min_rssi { min_rssi = rssi }
rssi > max_rssi { max_rssi = rssi }
#{ print "debug", n_trials, n_success, latency, min_latency, sum_latency, max_latency, rssi, min_rssi, sum_rssi, max_rssi }
END {
    if (n_trials == 0) {
        print 0,0,0,0,0,0,0,0,0
    } else {
        printf("%.3f %d %d", (n_trials - n_success)/n_trials,n_success,n_trials);
        if (n_success == 0) {
            print "",0,0,0,0,0,0
        } else {
            # the else avoids dividing by zero when every ping failed
            printf(" %.3f %.3f %.3f %d %.1f %d\n", min_latency, sum_latency / n_success, max_latency, min_rssi, sum_rssi / n_success, max_rssi)
        }
    }
}' ${INPUT}-computing

rm -f ${INPUT}-computing
exit 0
}}}
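get-value.sh reads one file per metric from /tmp/linkstats and calls /usr/local/bin/compute-stats.sh to refresh them, but that script is not reproduced on this page. The following is only a guess at the missing glue, assuming it runs link-stats.sh and splits the nine output fields into files named after the suffixes used in the snmpd.conf lines. The function name split_stats and the field-to-file mapping are assumptions, not the deployed script.

```shell
#!/bin/sh
# Hypothetical sketch, NOT the deployed compute-stats.sh.
# Reads one line of link-stats.sh output on stdin and writes each field to
# DIR/LINK-<name>, the files that get-value.sh cats.
# Field order follows the printf calls in link-stats.sh:
#   loss successes trials lat-min lat-ave lat-max rssi-min rssi-ave rssi-max
split_stats () {
    dir=$1
    link=$2
    # read fails at EOF without a newline, so also accept a non-empty first field
    read -r loss success trials lmin lave lmax rmin rave rmax || [ -n "$loss" ] || return 1
    mkdir -p "$dir"
    echo "$loss"    > "$dir/$link-loss"
    echo "$success" > "$dir/$link-ping-success"
    echo "$trials"  > "$dir/$link-ping-trials"
    echo "$lmin"    > "$dir/$link-latency-min"
    echo "$lave"    > "$dir/$link-latency-ave"
    echo "$lmax"    > "$dir/$link-latency-max"
    echo "$rmin"    > "$dir/$link-rssi-min"
    echo "$rave"    > "$dir/$link-rssi-ave"
    echo "$rmax"    > "$dir/$link-rssi-max"
}
```

Under these assumptions it would be driven as: /usr/local/bin/link-stats.sh | split_stats /tmp/linkstats backhaul. Note that link-stats.sh prints successes before trials, so the second and third fields must not be swapped.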
/etc/init.d/linkstats
{{{
#! /bin/sh
#
# skeleton  example file to build /etc/init.d/ scripts.
#           This file should be used to construct scripts for /etc/init.d.
#
#           Written by Miquel van Smoorenburg <miquels@cistron.nl>.
#           Modified for Debian GNU/Linux
#           by Ian Murdock <imurdock@gnu.ai.mit.edu>.
#
# Version:  @(#)skeleton  1.9.1  08-Apr-2002  miquels@cistron.nl
#

PATH=/usr/local/sbin:/usr/local/bin:/sbin:/bin:/usr/sbin:/usr/bin
DAEMON=/usr/local/bin/monitor-link.sh
NAME=monitor-link
DESC="link quality measurement"

test -x $DAEMON || exit 0

set -e

case "$1" in
  start)
    echo -n "Starting $DESC: $NAME"
    # --make-pidfile: the backgrounded script does not write its own pid file
    start-stop-daemon --start -b --quiet --make-pidfile \
        --pidfile /var/run/$NAME.pid --exec $DAEMON
    echo "."
    ;;
  stop)
    echo -n "Stopping $DESC: $NAME "
    # Match on the pid file only: for a shell script the running executable
    # is the interpreter, so --exec $DAEMON would not match.
    start-stop-daemon --stop --quiet --pidfile /var/run/$NAME.pid
    echo "."
    ;;
  restart|force-reload)
    #
    # If the "reload" option is implemented, move the "force-reload"
    # option to the "reload" entry above. If not, "force-reload" is
    # just the same as "restart".
    #
    echo -n "Restarting $DESC: $NAME"
    start-stop-daemon --stop --quiet --pidfile /var/run/$NAME.pid
    sleep 1
    start-stop-daemon --start -b --quiet --make-pidfile \
        --pidfile /var/run/$NAME.pid --exec $DAEMON
    echo "."
    ;;
  *)
    N=/etc/init.d/$NAME
    # echo "Usage: $N {start|stop|restart|reload|force-reload}" >&2
    echo "Usage: $N {start|stop|restart|force-reload}" >&2
    exit 1
    ;;
esac

exit 0
}}}
Diagnostics
To check remotely that the exports are working, run:
{{{
snmpget -c public -v 1 <ip> <oid>
}}}
with the IP of the device and any OID you like. The OIDs in the tables on this page are worth testing.
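To test all ten exec exports on one device in a single pass, the snmpget call can be wrapped in a loop. A minimal sketch using the community string and SNMP version shown on this page; the function name check_exports and the SNMPGET override are hypothetical conveniences.

```shell
#!/bin/sh
# Hypothetical helper: query every exec export (indexes 1 to 10) on one host.
# SNMPGET can be set to another command, e.g. SNMPGET=echo for a dry run
# that only prints the arguments.
SNMPGET=${SNMPGET:-snmpget}

check_exports () {
    host=$1
    i=1
    while [ "$i" -le 10 ]; do
        $SNMPGET -c public -v 1 "$host" "1.3.6.1.4.1.2021.8.1.101.$i"
        i=$((i + 1))
    done
}
```

Run it as check_exports <ip>; an export that returns nothing or an error points at the corresponding exec line or script on that device.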
TODO
- A (remote) backup strategy for mysql tables and rrds