Some info on the refactoring work currently underway

Since last September, initially at the instigation of Tony (murrant), we’ve been refactoring a lot of the discovery, polling and alerting code base. Some of this has caused a bit of pain along the way, but we are getting there. The cleanup started with OS discovery and the addition of unit testing so that we could ensure future changes wouldn’t break existing device support – something which has happened before. Along with that, other areas of improvement include:

– Improved sensor polling to make use of multiple SNMP calls per sensor type rather than calling them one at a time.
– A tidy-up of all of the MIBs that we have, including removing duplicates and moving vendor-specific MIBs into separate subdirectories.
– A more centralised include system to avoid including files within files within files.
– Various poller performance improvements to boost overall efficiency.
– A switch of OS definitions from being part of the global `$config` variable to separate YAML-based config files (`includes/definitions/*.yaml`).

Why have we put some focus into these things? LibreNMS has become quite popular, more so than we could have expected, and because of this the number of issues has naturally increased, whether from feature requests or bugs. By improving the code base and introducing unit testing we can hopefully lower the barrier to entry for people to contribute whilst ensuring the code stays clean and bug free.

One example of the improvement these changes have made is the previously mentioned global `$config` array that we inherited. All of the software’s configuration was held in that array. The benefit is that it’s available pretty much anywhere within the code base; the cost is a bloated array that would only continue to grow. A recursive count of `$config` before this refactor work put its size at 4161 items – yes, over 4,000 items! After the refactor work, the recursive count of `$config` stands at 1696. That’s an enormous difference: a reduction of more than half.

Along the way, things broke. Nothing too bad, but we have introduced regressions and bugs into the OS discovery process, the WebUI and sometimes data collection, which we always strive to avoid. We can’t promise that this won’t happen again, as we have a lot more work to do going forward, but we will continue to aim to provide an amazing NMS that you can rely on. One thing we are doing to make sure things are better tested in the future is enabling more data collection for stats.librenms.org: collecting sysObjectID and sysDescr values gives us more robust unit testing. See https://community.librenms.org/t/greater-statistics-collection-for-improved-os-detection/441 for more info.

As always, if you want to get involved with the rest of the team @ LibreNMS then please get in touch – you don’t need to be able to code to help make a difference.

Happy monitoring

Remote monitoring using tinc VPN

This is a guest blog by Florian Beer

This article describes how to use tinc to connect several remote sites and their subnets to your central monitoring server. This will let you connect to devices on remote private IP ranges through one gateway on each site, routing them securely back to your LibreNMS installation.

Configuring the monitoring server

tinc should be available on nearly all Linux distributions via package management. If you are running something different, just take a look at tinc’s homepage to find an appropriate version for your operating system (https://www.tinc-vpn.org/download/). I am going to describe the setup for Debian-based systems, but there are virtually no differences for CentOS or similar.

  • First make sure your firewall accepts connections on port 655 UDP and TCP.
  • Then install tinc via apt-get install tinc.
  • Create the following directory structure to hold all your configuration files: mkdir -p /etc/tinc/myvpn/hosts (“myvpn” is your VPN network’s name and can be chosen freely.)
  • Create your main configuration file: vim /etc/tinc/myvpn/tinc.conf
Name = monitoring
AddressFamily = ipv4
Device = /dev/net/tun
  • Next we need network up- and down scripts to define a few network settings for inside our VPN: vim /etc/tinc/myvpn/tinc-up
#!/bin/sh
ifconfig $INTERFACE 10.6.1.1 netmask 255.255.255.0
ip route add 10.6.1.0/24 dev $INTERFACE
ip route add 10.0.0.0/22 dev $INTERFACE
ip route add 10.100.0.0/22 dev $INTERFACE
ip route add 10.200.0.0/22 dev $INTERFACE

In this example we have 10.6.1.1 as the VPN IP address for the monitoring server on a /24 subnet. tinc sets $INTERFACE to the name of the VPN network interface, “myvpn” in this case. Next comes a route for the VPN subnet, so we can reach other sites via their VPN addresses. The last three lines designate the remote subnets: I want to reach and monitor devices on three different remote private /22 subnets from this server, so I set up a route for each of those remote sites in my tinc-up script.

  • The tinc-down script is relatively simple as it just removes the custom interface, which should get rid of the routes as well: vim /etc/tinc/myvpn/tinc-down
#!/bin/sh
ifconfig $INTERFACE down
  • Make sure your scripts can be executed: chmod +x /etc/tinc/myvpn/tinc-*
  • As a last step we need a host configuration file. This should be named the same as the “Name” you defined in tinc.conf: vim /etc/tinc/myvpn/hosts/monitoring
Subnet = 10.6.1.1/32

On the monitoring server we will just fill in the subnet and not define its external IP address to make sure it listens on all available external interfaces.

  • It’s time to use tinc to create our key-pair: tincd -n myvpn -K
  • Now the file /etc/tinc/myvpn/hosts/monitoring should have an RSA public key appended to it and your private key should reside in /etc/tinc/myvpn/rsa_key.priv.
  • To make sure that the connection will be restored after each reboot, you can add your VPN name to /etc/tinc/nets.boot.
  • Now you can start tinc with tincd -n myvpn and it will listen for your remote sites to connect to it.
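
The tinc-up script above follows a fixed pattern: one address line, one route for the VPN subnet, then one route per remote site. As a rough sketch (the function name render_tinc_up is made up for illustration, and the /24 netmask is hard-coded to match the example), a small shell helper could print that script for any number of sites:

```sh
# Hypothetical helper: print a tinc-up script to stdout.
#   $1 = VPN address of this node, $2 = VPN subnet, remaining args = remote subnets
# Nothing is written to /etc/tinc; redirect the output yourself once it looks right.
render_tinc_up() {
    vpn_ip=$1
    vpn_subnet=$2
    shift 2
    # $INTERFACE is kept literal on purpose: tinc sets it when it runs the script.
    printf '#!/bin/sh\n'
    printf 'ifconfig $INTERFACE %s netmask 255.255.255.0\n' "$vpn_ip"
    printf 'ip route add %s dev $INTERFACE\n' "$vpn_subnet"
    for subnet in "$@"; do
        printf 'ip route add %s dev $INTERFACE\n' "$subnet"
    done
}
```

For the monitoring server in this article the call would be render_tinc_up 10.6.1.1 10.6.1.0/24 10.0.0.0/22 10.100.0.0/22 10.200.0.0/22, with the output redirected to /etc/tinc/myvpn/tinc-up after review.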

Remote site configuration

Essentially the same steps as for your central monitoring server apply to all remote gateway devices. These can be routers, or any computer or VM running on the remote subnet, as long as they can reach the internet and forward IP packets.

  • Install tinc
  • Create directory structure: mkdir -p /etc/tinc/myvpn/hosts
  • Create main configuration: vim /etc/tinc/myvpn/tinc.conf
Name = remote1
AddressFamily = ipv4
Device = /dev/net/tun
ConnectTo = monitoring
  • Create up script: vim /etc/tinc/myvpn/tinc-up
#!/bin/sh
ifconfig $INTERFACE 10.6.1.2 netmask 255.255.255.0
ip route add 10.6.1.2/32 dev $INTERFACE
  • Create down script: vim /etc/tinc/myvpn/tinc-down
#!/bin/sh
ifconfig $INTERFACE down
  • Make executable: chmod +x /etc/tinc/myvpn/tinc-*
  • Create device configuration: vim /etc/tinc/myvpn/hosts/remote1
Address = 198.51.100.2
Subnet = 10.0.0.0/22

This defines the device IP address outside of the VPN and the subnet it will expose.

  • Copy over the monitoring server’s host configuration (including the embedded public key) and add its external IP address: vim /etc/tinc/myvpn/hosts/monitoring
Address = 203.0.113.6
Subnet = 10.6.1.1/32

-----BEGIN RSA PUBLIC KEY-----
VeDyaqhKd4o2Fz...
  • Generate this device’s keys: tincd -n myvpn -K
  • Copy this device’s host file, including the embedded public key, over to your monitoring server.
  • Add the name of the VPN to /etc/tinc/nets.boot if you want to autostart the connection upon reboot.
  • Start tinc: tincd -n myvpn

These steps can be repeated for every remote site; just choose different names and internal IP addresses. In my case I connected 3 remote sites running behind Ubiquiti EdgeRouters. Since those devices let me install software through Debian’s package management, it was very easy to set up. Just create the necessary configuration files and network scripts on each device and distribute the host configurations, including the public keys, to each device that will actively connect back.
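
Since only the name, external address and exposed subnet change from site to site, the host files can be stamped out in a loop or with a small function. The sketch below is only an illustration: write_remote_host is a made-up name, the addresses for remote2 and remote3 are placeholders, and the files go into a scratch directory rather than /etc/tinc.

```sh
# Hypothetical helper: write the Address/Subnet part of a tinc host file.
#   $1 = hosts directory, $2 = node name, $3 = external address, $4 = exposed subnet
write_remote_host() {
    mkdir -p "$1"
    printf 'Address = %s\nSubnet = %s\n' "$3" "$4" > "$1/$2"
}

# Stage the files in a scratch directory so nothing under /etc/tinc is touched.
staging=$(mktemp -d)
write_remote_host "$staging/hosts" remote1 198.51.100.2 10.0.0.0/22
write_remote_host "$staging/hosts" remote2 198.51.100.3 10.100.0.0/22   # placeholder address
write_remote_host "$staging/hosts" remote3 198.51.100.4 10.200.0.0/22   # placeholder address
ls "$staging/hosts"
```

Each file still needs its public key, which tincd -n myvpn -K appends when it is run on the device the file belongs to.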

Now you can add all devices you want to monitor in LibreNMS using their internal IP address on the remote subnets or using some form of name resolution. I opted to declare the most important devices in my /etc/hosts file on the monitoring server.

As an added bonus tinc is a mesh VPN, so in theory you could specify several “ConnectTo” on each device and they should hold connections even if one network path goes down.
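
tinc does accept more than one ConnectTo line in tinc.conf. As a minimal sketch (render_tinc_conf is a hypothetical helper, and remote2 is an assumed second peer), the remote site configuration from earlier could be generated with any number of peers like this:

```sh
# Hypothetical helper: print a tinc.conf with one ConnectTo line per peer.
#   $1 = name of this node, remaining args = peers to connect to
render_tinc_conf() {
    name=$1
    shift
    printf 'Name = %s\nAddressFamily = ipv4\nDevice = /dev/net/tun\n' "$name"
    for peer in "$@"; do
        printf 'ConnectTo = %s\n' "$peer"
    done
}

render_tinc_conf remote1 monitoring remote2
```

Every peer named in a ConnectTo line needs a host file with an Address entry in /etc/tinc/myvpn/hosts on that device, otherwise tinc has nowhere to connect to.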

Building docs from Markdown using MKDocs to GitHub Pages

In one of our last blogs we mentioned our move to using mkdocs to build HTML from our Markdown-formatted documentation and then auto-deploy it to GitHub Pages. We said we would expand on how we did this in case it’s useful to other projects.

If you are going to publish your docs under a custom domain on GitHub then you can only have one project within your organisation / account which is used to do this. If you’re just using the standard something.github.io then you can have as many of these as you like. We use docs.librenms.org, and because we already build www.librenms.org from GitHub we had to create a new organisation to support what we needed.

Firstly you will need to set the branch that GitHub uses to publish your site. This can be either gh-pages, which is the default, or another branch that you choose as an override (custom domains default to master). This is the branch that your deploy script will push the HTML files to.

Now it’s time to organise some files for your main project, https://github.com/librenms/librenms in our case. Our docs are located in the folder doc/ within this repository. In the root of the repository we have a .travis.yml file, and in the scripts/ folder a deploy-docs.sh. Our app is PHP based, so we have Travis run some unit, lint and style checks; things might be a little different for you. In .travis.yml we define the PHP versions we run checks for and which checks will be run. The most important setting for the docs deployment is EXECUTE_BUILD_DOCS=true for one of the PHP versions. We don’t set this for more than one, otherwise the deploy script would be invoked multiple times.

The after_success part comes next: test $TRAVIS_PULL_REQUEST == "false" && test $TRAVIS_BRANCH == "master" && test $EXECUTE_BUILD_DOCS == "true" && bash scripts/deploy-docs.sh. This checks that Travis is running for a merge and NOT a pull request ($TRAVIS_PULL_REQUEST == "false"), that the branch being merged to is master ($TRAVIS_BRANCH == "master") and that we have defined the variable to build the docs ($EXECUTE_BUILD_DOCS == "true"). If all of these pass then we execute scripts/deploy-docs.sh. You will need to adjust the $TRAVIS_BRANCH check to match the default branch of your MAIN repo.
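
The same three checks can also be wrapped in a small shell function, which makes them easier to read and to try out locally. This is only a sketch of the idea: should_deploy_docs is a hypothetical name, not something from our repository.

```sh
# Hypothetical helper: succeed only for a non-PR build of master
# that has the docs flag set.
should_deploy_docs() {
    # Quoting and the :- defaults keep the tests valid when a variable is unset.
    [ "${TRAVIS_PULL_REQUEST:-}" = "false" ] &&
    [ "${TRAVIS_BRANCH:-}" = "master" ] &&
    [ "${EXECUTE_BUILD_DOCS:-}" = "true" ]
}
```

It would be used as should_deploy_docs && bash scripts/deploy-docs.sh. The single = is the portable spelling of the comparison; == works in bash but is not guaranteed in every shell's test.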

An example .travis.yml file

 

We now move on to the deploy-docs.sh script. This is a custom script, so you will need to update various variables and paths before you run it. Remember that this script is run by Travis itself; the sequence of events is the same as if you built the docs manually and pushed them to your docs repository. At the top of the script you can define some variables which will be used later on. Don’t remove ${GH_TOKEN}: this variable is defined outside of the scope of the script and we will get to it later.

The pip commands install the relevant software (mkdocs) and add-ons into the Travis environment. You may need more than this depending on which dependencies you make use of within Travis. We build our docs into a folder called out/, which we create and then enter to set up the initial git environment and pull in the latest master branch. Next we go back up a level and clone the main repo so that we can convert the Markdown documentation to HTML.

mkdocs is now called, which uses the mkdocs.yml within the root directory of your repository. If everything works then your build should succeed, and we move on to committing the changes and pushing them to your documentation repository. It’s important that you run git push with -q to suppress any output: if something goes wrong you could otherwise expose your GitHub token in the Travis output!
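
To make that sequence concrete, here is a rough sketch of what such a script could look like. It is not our actual deploy-docs.sh: the repository names, the committer identity and the function names are all placeholders, and it assumes mkdocs leaves the hidden .git folder in out/ alone when it cleans the output directory.

```sh
#!/bin/bash
# Sketch of a docs deploy script; repository names and identities are placeholders.
set -eu

DOCS_REPO="example-org/example-docs"   # hypothetical: repository served by GitHub Pages
MAIN_REPO="example-org/example"        # hypothetical: repository holding the Markdown
DOCS_BRANCH="master"                   # branch GitHub Pages publishes from

# Push URL with the token embedded; GH_TOKEN is injected by Travis.
docs_push_url() {
    printf 'https://%s@github.com/%s.git' "$GH_TOKEN" "$DOCS_REPO"
}

deploy_docs() {
    work_dir=$PWD
    pip install --user --quiet mkdocs

    # Set up out/ as a git checkout of the published branch.
    mkdir -p out
    (
        cd out
        git init -q
        git remote add origin "$(docs_push_url)"
        git config user.name "docs-bot"                # placeholder identity
        git config user.email "docs-bot@example.org"
        # Discard output so a failure cannot echo the token-bearing URL.
        git pull -q origin "$DOCS_BRANCH" >/dev/null 2>&1
    )

    # Clone the main repository and build its Markdown into out/.
    git clone -q --depth 1 "https://github.com/${MAIN_REPO}.git" src
    (cd src && mkdocs build --site-dir "$work_dir/out")

    # Commit and push quietly, only if the build changed anything.
    (
        cd out
        git add -A .
        if git diff --cached --quiet; then
            echo "Docs unchanged, nothing to push"
        else
            git commit -q -m "Docs update for ${TRAVIS_COMMIT:-manual build}"
            git push -q origin "HEAD:${DOCS_BRANCH}" >/dev/null 2>&1
        fi
    )
}

# Travis sets TRAVIS=true in its build environment; do nothing elsewhere.
if [ "${TRAVIS:-}" = "true" ]; then
    deploy_docs
fi
```

Redirecting the output of the pull and push on top of -q is a belt-and-braces choice: the token is part of the remote URL, so nothing git prints about that remote should reach the build log.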

An example deploy-docs.sh file

GH_TOKEN – the variable we mentioned earlier. What is it and where does it come from? To push to a GitHub branch you need either login credentials or a GitHub personal access token. We make use of the latter here. You don’t want to put the token into your deploy script, as anyone who has it can deploy to your repository, so we use the Environment Variables section within Travis. Set GH_TOKEN as the name, then enter the token you generated within GitHub as the value. Make sure “Display value in build log” is set to OFF – THIS IS IMPORTANT. Now click Add.

You should be all set now. When someone submits a pull request to your project, Travis will skip the docs build and continue to process your other tests as normal. When the pull request is merged, Travis will build the docs as per the steps in deploy-docs.sh, and you should automatically see the updated live version once Travis has completed its processes.

Community update

So it’s been a super crazy busy month for us and by the looks of it, many others who’ve contributed to LibreNMS!

The project has been growing month on month: more forks, more users but above all else – more contributors. I know one of Paul’s worries when he originally forked the software was that he wouldn’t be able to give it the time it needs. Like some other open source projects which have come and gone, if it had only been Paul contributing and supporting people then without a doubt it wouldn’t have lasted. However, time has proven this wasn’t the case. We’ve passed 250 contributors this month, which is amazing; we’re on our 18th stable monthly release, with over 11k commits (this isn’t something to measure, and this number will increase more slowly now that we are squashing commits on merge) and over 2k pull requests. Just looking back over the last month alone, we’ve merged 240+ pull requests from 32 contributors – that’s around 8 pull requests per day and roughly one unique contributor each day!

What’s even better is we’re two months away from the 3 year anniversary of LibreNMS – who’s making cakes?

So what else has been going on I hear you ask?

docs.librenms.org

We launched our new documentation site (and for a good chunk of it, new or updated documentation as well). For the last couple of years we’ve been using the truly awesome readthedocs.org service – a big kudos to the team there, the service they provide has been brilliant. Before that we made do with docs in the repo’s doc folder and the wiki available on GitHub repos. So what have we switched to and why? GitHub, but not just serving Markdown within the main repo: we use Travis to build the docs site with mkdocs.org, which is then pushed to a new repo where it’s served via GitHub Pages – we’ll do a blog post on this in more detail. This has given us the flexibility to customise the look and feel of the website (we’ve tried to keep it similar to our main website www.librenms.org) and also how the navigation works, which includes a pretty good search facility. We’ve also been able to provide an ‘Edit on GitHub’ link for each document so you can jump straight from reading to writing 🙂 As with everything we do, the docs, theme and code are all available on our public repos:

github.com/librenms/librenms
github.com/librenms-docs/librenms_theme

community.librenms.org

This is kind of beta at present. We currently have a Google group, but it’s mainly used for announcements, most of the support there is done by other users, and hardly any of the core development team frequent it. The community site is trying to bridge that gap by providing a place where users can ask questions and discuss ideas that may not be appropriate as a GitHub issue or that could get lost in the daily noise on the IRC channel. It’s certainly in its infancy and has had little time spent on getting it set up, but already people are gravitating to it. Over the coming months we’d like to see it looking more like a LibreNMS site and becoming more useful to people, but only time will tell – as with anything we do, feedback is always appreciated.

REFACTORRRRRRRRRRRRRRR!

We have to give a big shout out to Tony Murray, who has single-handedly led the refactor movement. Our code base has a lot of inefficiencies in it, some from before the fork and some from us. We come across bugs on a constant basis and introduce more when we make changes which have a knock-on effect – this is one reason why we’ve kick-started the rewrite of the web stack. We’ve made some huge strides in the last month to bring the code base into line with a coding standard, for which we’ve settled on PSR2. We’ve been able to add checking for this into Travis, so all code will be formatted the same way going forward.

Unit testing is slowly being introduced. v2 requires it, but v1 has never had any before. Slowly but surely this is being factored in where possible.

Lint checking: you’d hope that everyone would write perfect code, and so do we 🙂 However, one thing that has caught us out is support for PHP 5.3 and 5.4. The code isn’t broken as such, but people sometimes default to newer syntax, for example defining arrays with $test = [], which just doesn’t work in PHP 5.3. We’re doing our best to maintain support for these versions, and with lint checking now in place we can see when code is introduced which may break this.

Travis will take care of checking these for you when you submit a pull request, however to speed things up you can run ./scripts/pre-commit.php to run the same checks so you can be sure when you submit a pull request it won’t need code fixes 🙂

One of the other big areas of refactoring has been the rrdtool functions. Tony Murray has centralised all of our data calls (RRD and InfluxDB brought together) so that file names are standardised and a lot of repeated code has been removed. The rrdtool file creation and updates have been refactored so that we can now make use of rrdtool 1.5 or later to create rrd files remotely, for those not wanting to use a shared filesystem.

Some of these changes have led to a few hiccups along the way, but hopefully everyone agrees that it’s worth doing now rather than plodding along with what we had before. It’s a good opportunity to remind people, though, that we do have a stable monthly release available for those that don’t want the latest bleeding edge code. You can switch to this by checking out a release such as 20160828 (git checkout 20160828) and then updating config.php to use: $config['update_channel'] = 'release';

LibreNMS giving back

So it’s been a while since the last blog post so this is long overdue!

Since the LibreNMS summit in 2015, we’ve had a little bit of the donation money left over, which day to day we have no use or need for. We’ve considered buying some kit to develop / test against but most of the time this isn’t necessary. So aside from keeping it in the bank, what do we do with it?

We’re not talking about a huge amount of cash, about $260 in total, but it makes sense to make use of it in some way. The core development team had a chat and we thought the best use for it would be to contribute back to other open source projects that we make use of within LibreNMS. So that’s what we did. We’ve only done it once so far, but we hope to be able to spread the remaining amount to at least one other project.

So where has the money gone? RRDtool – it’s one of the biggest reasons why LibreNMS works. The work that Tobias has done and continues to do on RRDtool is nothing short of amazing. We’ve all banged our heads against a wall with it on the odd occasion, but the reality is that quite a few network monitoring platforms wouldn’t exist today if it wasn’t for RRDtool. You can see a list of people who’ve donated to RRDtool here: https://tobi.oetiker.ch/webtools/appreciators.txt

We wish we could do more, but we don’t generate revenue from the work we all put into LibreNMS so we just don’t have the resources available to make a bigger difference.

If you have any suggestions on where else we can donate the remainder of the money ($130), then please drop a comment here or email team@librenms.org.

Summit report

Once again I must apologise that it has been too long since my last post!  It has been a busy few weeks at work since the summit, and also a busy few weeks for LibreNMS.

I’ve recently updated our web page to specifically thank those who contributed to the summit.  We are humbled by and grateful for your confidence in us.  Five people in total attended the summit:

  • Neil Lathwood, UK
  • Daniel Preußker, Germany
  • Søren Rosiak, Denmark
  • Mike Rostermund, Denmark
  • Paul Gear, Australia

All of the above are now registered committers/reviewers in the LibreNMS organisation on GitHub.

Our day started with personal introductions – many of us had not met in person until the weekend of the summit.  After a brief talk about some administrative issues, the morning was spent working through our future development priorities.  We then enjoyed lunch together, followed by an afternoon of reviewing issues and working on fixes.

Due to the provision of a meeting room by Canonical, our costs for running the summit were lower than expected, and we were able to pay for the accommodation and airfares not only for Daniel, but also Søren and Mike.  Here’s a breakdown of how we used the funds raised through the Indiegogo campaign:

  • Daniel Preußker – travel from Germany + accommodation in London – €492
  • Søren Rosiak – travel from Denmark + accommodation in London (for Søren and Mike) – €465
  • Paul Gear – lunch for the summit participants – €119
  • Neil Lathwood – remaining funds from summit for use in legal defence – €376

(The above amounts are not exact due to rounding and currency conversions, and exclude the fees from Indiegogo.  However, the above represents 100% of the usable funds from the summit campaign.)

The big items on our agenda on the technical side were:

  • Installation – The current setup works reasonably well, but we still have a steady stream of people coming through the IRC channel who manage to mess up permissions; we’d like to make this easier by creating a standard installer process that works out which distribution it’s running on and makes all the right adjustments.  https://github.com/joubertredrat had a first attempt at this, which we may use.  (There may be some issues dealing with things like SELinux, but Daniel feels this is solvable with a relatively small SELinux configuration.)
  • Documentation – There is plenty of improvement that could be made around installation, interacting with git, coding standards, and the FAQ.  We will work on this as time permits.  If you aren’t a coder, but would like to make a contribution to the project, this would be a great way to get involved!
  • Alerting – Daniel is working on the next version of alerting in which he hopes to incorporate both functionality and UI improvements.  The ability to share rules with other LibreNMS users was discussed, but no concrete plans have been made yet.  Your feedback would be appreciated: What works well? What doesn’t? What would you like to see?
  • Graphing engine – We are tracking several possibilities with respect to updated graphing engines that would be useful in overcoming some of our current limitations.  There are no immediate plans to migrate, but we will continue to track the projects which look most viable.
  • Poller – There are a number of common requests we see from time to time:
    1. ping-only polling
    2. SNMP-only polling [this was recently added]
    3. polling at custom intervals
    4. other polling methods such as netflow, HTTP, NTP

    All of the above are achievable, but will require non-trivial changes to the existing codebase.

If you were a contributor to the campaign and are due some priority attention on an issue (I’ve had one reminder about this already), please get in touch with us via email: team at librenms dot org.  If you haven’t yet submitted an issue on GitHub, please do so before emailing us (or just let us know in the email if there are reasons why you can’t do that).

Blown away by community

It has been quite a while since my last post, so I thought I’d break the radio silence and take the time for a look back at what we’ve been up to. A lot of water has gone under the bridge since my first commit to LibreNMS on 28 October 2013, and the time has flown!

My biggest fear when I started the project was that my lack of time would mean the codebase would languish, and I’d be left with something that worked, but didn’t have a viable future.  However, LibreNMS was born because I felt the need for a network monitoring system whose community:

The community which has gathered over the last 20 months or so has amazed me in fulfilling this vision.  It’s not an understatement to say that all of my original goals for LibreNMS have been met and exceeded already.

Particular thanks must go to the other two core LibreNMS team members for their efforts:

  • Neil Lathwood has been far and away the most prolific coder on the project and has relentlessly pursued fixing bugs and adding features to benefit our user base.
  • Daniel Preussker wrote our alerting system and came on board as a code reviewer.  His emphasis on security and efforts to improve code quality have been a huge bonus to the team.

The last two months in particular have been a whirlwind, with the number of participants in the IRC channel, mailing list, and issue system showing a dramatic increase (possibly due to a few mailing list discussions and the odd reddit thread).  In IRC alone we went from around 10 regular channel participants to hitting 80 for the first time last week.

Our number of contributors has been growing rapidly in recent weeks, with various contributors joining to provide code to support their preferred devices.  I attribute a large portion of our success here to using git as our SCM and GitHub as our method for collaboration – they make it easy to integrate and collaborate at the code level.  Many of our contributors have never even worked on a DVCS before!

It has been a privilege and a pleasure to see LibreNMS develop into a testimony to the power of community to make Free Software awesome.

In addition to the API which was integrated near the end of 2014 (see earlier blog posts), we’ve made some great progress on other features, including:

  • a customisable alerting system which includes integrations with Slack, HipChat, PagerDuty, and Pushover
  • an update of bootstrap to a more recent version, with its use extended to various tables in the system through bootgrid
  • a distributed poller to allow segmenting and scaling for load
  • documentation integrated directly with our git repository using Read the Docs
  • new or improved support for dozens of device types, including many relevant to Wireless Internet Service Providers (WISPs)

There are many other fixes and improvements as well – see the changelog for full details.

It has been a wild ride so far with LibreNMS, and I’m both thankful to all who’ve contributed to our community so far, and excited at what the future holds.

API merged

In case anyone hasn’t noticed Neil’s blog post or our Twitter feed, we’ve recently merged in Neil’s API work, along with a few updates from me.  The API is based at /api/v0 in your LibreNMS install; it is marked as v0 to signify that it is a pre-stable interface – please do not assume that any part of the API is guaranteed stable until we mark it as v1.  As you may have gathered from my previous notes about API design, there are some interesting new developments afoot in the world of RESTful APIs, and we’d like to work towards implementing the best possible API design.

Please note that the implementation of the API is still a work in progress.  Known issues at the moment are:

  1. Incomplete checking of user permissions
  2. No security auditing or hardening has been done
  3. Encoding of interface names containing slashes needs to be tested
  4. LibreNMS doesn’t come with any specific support or documentation for setting up HTTPS by default
  5. Creation of API tokens is still manual

Because of these issues, I recommend that you do not expose your LibreNMS install’s API to untrusted systems, and especially do not make it available on public web sites.  I expect we’ll issue an update to the documentation soon to provide specific guidance for locking down the API, and hope that we’ll have a number of code updates to address the above implementation issues shortly.
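
For anyone who wants to try the API from a trusted machine, a request could look roughly like the sketch below. It assumes the token is sent in an X-Auth-Token header and that a devices route exists under /api/v0; the function names, the LIBRENMS_TOKEN variable and the hostname are placeholders of mine, and since this is a v0 interface any of it may change.

```sh
# Build a URL under the API base path /api/v0.
#   $1 = base URL of the LibreNMS install, $2 = path below /api/v0
api_url() {
    base=${1%/}    # drop one trailing slash, if present
    path=${2#/}    # drop one leading slash, if present
    printf '%s/api/v0/%s' "$base" "$path"
}

# Issue a GET request, sending the token in the X-Auth-Token header.
# LIBRENMS_TOKEN is a hypothetical environment variable holding your API token.
api_get() {
    curl -sS -H "X-Auth-Token: ${LIBRENMS_TOKEN}" "$(api_url "$1" "$2")"
}
```

With a token exported as LIBRENMS_TOKEN, api_get https://nms.example.org devices would request the device list. Keep such calls on a trusted network, and bear in mind that a token passed on the curl command line is visible to other users on the same machine.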