Apache NiFi: NetFlow to Syslog
Most organizations export network flow data at some level. This data can be used for anything from troubleshooting to statistics to security. There are many flow formats available, such as NetFlow, sFlow, and IPFIX. For this example, we’ll be using Apache NiFi to ingest NetFlow v5 and output the resulting flow information to a syslog server.
Let’s start with the basics of enabling NetFlow export on Ubuntu using fprobe. My Ubuntu instance already receives a port mirror of all Internet traffic on my home network on interface eth1. Installation and configuration are simple:
sudo apt install fprobe
…and that’s it. You’ll get prompted to specify the interface and NetFlow destination. In this case, we’ll be using NiFi as the destination on port 2055/UDP. The next step is to build the flow in NiFi.
Let’s start by creating a ListenUDP processor that receives data on port 2055/UDP. Drag the ListenUDP processor onto the canvas and double-click to open up the settings. Go to Properties and plug in your interface and 2055 for the port number. You’ll end up with something like this:
Next we’ll create a ParseNetflowv5 processor. Add the processor to the canvas, double-click, and go to Settings. We’ll want to check Failure and Original under Automatically Terminate Relationships since we only want to pass along the parsed NetFlow record. There is nothing else to change here since the processor natively understands the NetFlow records.
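If you’re curious what a parser like this has to do under the hood, here is a rough Python sketch that decodes a NetFlow v5 datagram by hand. It is not NiFi’s implementation. The function name parse_netflow_v5 and the dictionary keys are placeholders I made up, so they won’t match the field names NiFi emits. The layout follows the published v5 format: a 24-byte header followed by fixed-size 48-byte flow records.

```python
import socket
import struct

# NetFlow v5 layout in network byte order: a 24-byte header followed by
# `count` fixed-size 48-byte flow records.
HEADER = struct.Struct("!HHIIIIBBH")
RECORD = struct.Struct("!4s4s4sHHIIIIHHxBBBHHBBxx")


def parse_netflow_v5(datagram):
    """Decode one NetFlow v5 datagram into a list of dicts, one per flow.

    Hypothetical helper for illustration only; the key names below are
    placeholders, not the field names NiFi's ParseNetflowv5 produces.
    """
    if len(datagram) < HEADER.size:
        raise ValueError("datagram is shorter than a NetFlow v5 header")
    version, count = struct.unpack_from("!HH", datagram, 0)
    if version != 5:
        raise ValueError("not a NetFlow v5 datagram (version %d)" % version)
    if len(datagram) < HEADER.size + count * RECORD.size:
        raise ValueError("datagram is truncated")

    flows = []
    for i in range(count):
        offset = HEADER.size + i * RECORD.size
        (src, dst, next_hop, in_if, out_if, packets, octets, first, last,
         src_port, dst_port, tcp_flags, protocol, tos,
         src_as, dst_as, src_mask, dst_mask) = RECORD.unpack_from(datagram, offset)
        flows.append({
            "src_addr": socket.inet_ntoa(src),
            "dst_addr": socket.inet_ntoa(dst),
            "src_port": src_port,
            "dst_port": dst_port,
            "protocol": protocol,
            "packets": packets,
            "octets": octets,
        })
    return flows
```

Each exported datagram can carry several flows, which is why the function returns a list rather than a single record.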
Last, we’ll want to use the PutUDP processor to send the processed data back out to a syslog server. Add the PutUDP processor to the canvas, double-click, and again go to Settings. Check Success and Failure under Automatically Terminate Relationships since this is the last processor in the flow. Next, go to Properties and add the syslog hostname and destination port.
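For comparison, here is roughly what sending a record to a syslog listener over UDP looks like at the socket level. This is only a sketch: format_syslog and send_udp are names I made up, the hostname and tag defaults are placeholders, and it assumes the receiver wants a BSD-style (RFC 3164) header in front of the message. Whether your syslog server needs that framing depends on how it is configured.

```python
import socket
from datetime import datetime


def format_syslog(message, facility=16, severity=6,
                  hostname="nifi", tag="netflowv5"):
    """Wrap a message in a BSD-style syslog line and return it as bytes.

    Hypothetical helper; hostname and tag are placeholder defaults.
    Facility 16 is local0 and severity 6 is informational.
    """
    pri = facility * 8 + severity
    now = datetime.now()
    # BSD syslog pads a single-digit day with a space, not a zero.
    timestamp = "%s %2d %s" % (now.strftime("%b"), now.day,
                               now.strftime("%H:%M:%S"))
    line = "<%d>%s %s %s: %s" % (pri, timestamp, hostname, tag, message)
    return line.encode("utf-8")


def send_udp(payload, host, port):
    """Send one datagram to host:port. UDP gives no delivery guarantee."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(payload, (host, port))


# Example (the hostname here is a placeholder for your own syslog server):
# send_udp(format_syslog('{"dst_port": 443}'), "syslog.example.com", 514)
```

Because this is UDP, a send that succeeds locally says nothing about whether the server received the message, so it is worth checking on the receiving side.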
The last step is to connect the processors. First, connect the ListenUDP processor to the ParseNetflowv5 processor using Success for the relationship. Next, connect the ParseNetflowv5 processor to the PutUDP processor, also using Success for the relationship. Finally, right-click on the canvas and select Start. You should see the processors turn green.
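If you want to check the flow without waiting for real traffic, one option is to send a single hand-built NetFlow v5 datagram at the listener. The sketch below does that in Python. The function names are my own, the addresses come from the documentation ranges, and every value in the record is made up. It assumes the listener is reachable on 127.0.0.1 port 2055, so adjust the host if NiFi runs elsewhere.

```python
import socket
import struct
import time


def build_test_datagram():
    """Build a NetFlow v5 datagram holding one made-up flow record."""
    header = struct.pack(
        "!HHIIIIBBH",
        5,                 # version
        1,                 # number of records that follow
        60000,             # sysUptime in ms (arbitrary)
        int(time.time()),  # export time, seconds
        0,                 # export time, residual nanoseconds
        1,                 # flow sequence number
        0, 0,              # engine type, engine id
        0,                 # sampling interval
    )
    record = struct.pack(
        "!4s4s4sHHIIIIHHxBBBHHBBxx",
        socket.inet_aton("192.0.2.10"),     # source (documentation range)
        socket.inet_aton("198.51.100.20"),  # destination (documentation range)
        socket.inet_aton("0.0.0.0"),        # next hop
        1, 2,              # input / output interface index
        10, 8400,          # packets, bytes
        1000, 59000,       # flow start / end, in sysUptime ms
        51515, 443,        # source / destination port
        0x18,              # TCP flags (PSH + ACK)
        6,                 # protocol (TCP)
        0,                 # type of service
        0, 0,              # source / destination AS
        24, 24,            # source / destination mask bits
    )
    return header + record


def send_test_flow(host="127.0.0.1", port=2055):
    """Send the synthetic datagram to the listener (assumed host and port)."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(build_test_datagram(), (host, port))
```

Calling send_test_flow() once should produce a single parsed record downstream if everything is wired up, which is easier to spot than one flow among thousands.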
At this point the data should be showing up in our syslog server. I’m using Elasticsearch in this example. You should be able to search for the phrase netflowv5 to see examples of the data:
Success! Keep in mind there will be a lot of data on a busy network.