SCHMAUSTECH: MQTT


Tuesday, February 27, 2024

Exploring Golang with MQTT: File Transfers

While I've dabbled in writing scripts using Perl, Bash, and Python, I wouldn't necessarily label myself a developer. However, I do have a penchant for automation, organizing logic, and embracing challenges. That curiosity led me to explore Golang recently and experiment with its integration with MQTT. The pairing of MQTT's lightweight messaging protocol with Go's concurrency model makes a compelling case for using the two together.

MQTT (Message Queuing Telemetry Transport) is a lightweight publish-subscribe messaging protocol designed for efficient communication between devices with limited bandwidth and processing capabilities. It excels in scenarios where real-time data exchange is crucial, such as IoT (Internet of Things) applications, telemetry systems, and messaging platforms.

Golang, or Go, is a modern and efficient programming language known for its simplicity, concurrency support, and performance. It provides built-in support for concurrent programming through goroutines and channels, making it ideal for building highly concurrent and scalable systems.

When you combine MQTT with Go, you leverage the strengths of both technologies to create robust, scalable, and real-time communication systems. Here's why this combination is so compelling:

Efficiency: Both MQTT and Go are designed for efficiency. MQTT's lightweight protocol minimizes bandwidth and processing overhead, making it suitable for resource-constrained environments. Go's efficient runtime and concurrency model allow you to handle a large number of concurrent connections and process messages concurrently with minimal overhead.

Concurrency: Go's built-in support for concurrency with goroutines and channels aligns perfectly with MQTT's asynchronous messaging paradigm. You can easily handle thousands of concurrent MQTT connections and process incoming messages concurrently, leveraging the power of parallelism to scale your applications.

Simplicity: MQTT and Go are both known for their simplicity and ease of use. With the paho.mqtt.golang library, integrating MQTT into your Go applications is straightforward and intuitive. You can quickly connect to MQTT brokers, publish and subscribe to topics, and handle messages with minimal boilerplate code.

Scalability: The combination of MQTT and Go enables you to build highly scalable systems that can handle massive workloads with ease. Whether you're building IoT platforms with millions of devices or real-time messaging systems with high throughput requirements, MQTT and Go provide the scalability you need to meet your application's demands.

Fast forward to my little Go project, which came out of some of my research around transferring files via MQTT. While it may not be the most practical approach, I had read of others doing it with Python and even with just the Mosquitto publisher and subscriber tools (mosquitto_pub and mosquitto_sub). But again, the goal here was to learn a little about Go and tie it into something that motivated me to figure it out. Hence my own file transfer publisher and subscriber written in Go.

The publisher code, which can be found here, does the following:

Establishes a connection to the broker server with the topic of transfer
Watches the /root/outbound directory on the host where it is run
Any files that are dropped into the directory are then published to the broker on the topic of transfer

Note: A single MQTT message is capped at roughly 256 MB by the protocol, and brokers are often configured with a lower limit. Files larger than the limit have to be split into multiple messages, which creates another challenge.

The subscriber code, which can be found here, does the following:

Establishes a connection to the broker server with the topic of transfer
Listens for any published messages on the topic transfer
When a message is published, in this case a file, the subscriber pulls it down and places it into the /root/inbound directory
The subscriber will also try to determine the file type by using the mimetype library and looking at the first 512 bytes of the file. If it cannot be determined it defaults to a .txt extension.
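The file type detection step can be illustrated as follows. Note the swap: instead of the mimetype library the subscriber uses, this sketch relies on the standard library's http.DetectContentType, which also considers at most the first 512 bytes. The extension table and the file_<timestamp> naming pattern are assumptions made for illustration.

```go
package main

import (
	"fmt"
	"net/http"
	"strings"
)

// extensions maps a few sniffed content types to file extensions.
// The table is illustrative, not exhaustive.
var extensions = map[string]string{
	"image/png":       ".png",
	"image/jpeg":      ".jpg",
	"image/gif":       ".gif",
	"application/pdf": ".pdf",
	"application/zip": ".zip",
}

// guessExtension sniffs at most the first 512 bytes of payload and
// falls back to ".txt" when the type is not in the table.
func guessExtension(payload []byte) string {
	head := payload
	if len(head) > 512 {
		head = head[:512]
	}
	// DetectContentType may append parameters such as "; charset=utf-8".
	mediaType, _, _ := strings.Cut(http.DetectContentType(head), ";")
	if ext, ok := extensions[strings.TrimSpace(mediaType)]; ok {
		return ext
	}
	return ".txt"
}

// inboundName builds a name of the (assumed) form file_<unix timestamp><ext>.
func inboundName(unix int64, payload []byte) string {
	return fmt.Sprintf("file_%d%s", unix, guessExtension(payload))
}
```

In a subscriber callback, inboundName(time.Now().Unix(), payload) would give the name to write under /root/inbound.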

The go.mod file used for the project can be found here.

For the experiment I simply created the two inbound/outbound directories on my system. In separate terminals I went ahead and ran each of the Go programs. In a third terminal I set up a watch on the directory listing for inbound. Then in a fourth terminal I went ahead and used the copy command to place some files into the outbound directory which the publisher code was watching. The demo is below:

For a first pass at this experiment I was fairly pleased, but there is definitely room for improvement in the following items:

I want to be able to pass as arguments my MQTT server, the topic and what directory to watch.
I need to figure out a better way to handle file names on the subscriber side. Currently every file ends up with the name file followed by a Unix timestamp and then, if it was identified correctly, the right extension. One thought I have for this is to bundle up the file on the publisher side into a JSON payload containing the file name, the extension, the size and then the actual data blob. On the subscriber side we would then process that payload on receipt to obtain the real file name, extension and data of the file.
We need to handle large files better. Since they are chunked up, the subscriber side needs to be able to take the chunks, assemble them back together and then process the result.
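The last two ideas, a JSON payload that carries the file metadata and reassembly of chunked files, could be combined along the following lines. This is a sketch of one possible design, not something the current publisher or subscriber does; the Chunk type, its field names and the Split and Join helpers are all hypothetical.

```go
package main

import (
	"bytes"
	"encoding/json"
	"errors"
	"path/filepath"
)

// Chunk is one message's worth of a file. All field names are hypothetical.
type Chunk struct {
	Name  string `json:"name"`  // original file name
	Ext   string `json:"ext"`   // extension, including the dot
	Size  int    `json:"size"`  // size of the whole file in bytes
	Index int    `json:"index"` // position of this chunk, starting at 0
	Total int    `json:"total"` // number of chunks in the file
	Data  []byte `json:"data"`  // encoding/json base64-encodes []byte
}

// Split cuts data into JSON-encoded chunks of at most chunkSize bytes each.
func Split(name string, data []byte, chunkSize int) ([][]byte, error) {
	if chunkSize <= 0 {
		return nil, errors.New("chunkSize must be positive")
	}
	total := (len(data) + chunkSize - 1) / chunkSize
	if total == 0 {
		total = 1 // an empty file still produces one message
	}
	msgs := make([][]byte, 0, total)
	for i := 0; i < total; i++ {
		start, end := i*chunkSize, (i+1)*chunkSize
		if end > len(data) {
			end = len(data)
		}
		msg, err := json.Marshal(Chunk{Name: name, Ext: filepath.Ext(name),
			Size: len(data), Index: i, Total: total, Data: data[start:end]})
		if err != nil {
			return nil, err
		}
		msgs = append(msgs, msg)
	}
	return msgs, nil
}

// Join decodes the messages of one file, in any arrival order,
// and returns the original name and contents.
func Join(msgs [][]byte) (string, []byte, error) {
	var (
		name  string
		size  int
		parts [][]byte
		seen  []bool
	)
	for _, m := range msgs {
		var c Chunk
		if err := json.Unmarshal(m, &c); err != nil {
			return "", nil, err
		}
		if parts == nil {
			if c.Total <= 0 {
				return "", nil, errors.New("invalid chunk total")
			}
			name, size = c.Name, c.Size
			parts, seen = make([][]byte, c.Total), make([]bool, c.Total)
		}
		if c.Index < 0 || c.Index >= len(parts) {
			return "", nil, errors.New("chunk index out of range")
		}
		parts[c.Index], seen[c.Index] = c.Data, true
	}
	for _, ok := range seen {
		if !ok {
			return "", nil, errors.New("missing chunk")
		}
	}
	data := bytes.Join(parts, nil)
	if len(parts) == 0 || len(data) != size {
		return "", nil, errors.New("incomplete file")
	}
	return name, data, nil
}
```

The trade-off is that base64 inflates each payload by roughly a third. A fuller version would also need a transfer ID so that chunks from different files arriving on the same topic are not mixed.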

Using MQTT with Golang allows you to leverage the lightweight, efficient messaging protocol of MQTT and the concurrency and scalability of Go to build robust, scalable, and real-time communication systems. Whether you're building IoT applications, telemetry systems, or messaging platforms, this combination provides the performance, efficiency, and simplicity you need to succeed.

Monday, November 06, 2023

Is Edge Really a New Concept?

In 1984, John Gage from Sun Microsystems coined the phrase "The Network is the Computer". In making the statement, he was putting a stake in the ground that computers should be networked, otherwise they are not utilizing their full potential. Ever since then, people have been connecting their servers, desktops and small devices to the network to provide connectivity and compute to a variety of locations for varying business purposes.

Take, for example, when I worked at BAE Systems back in the 2008-2010 period. We already had remote unmanned sites where we had compute that was ingesting data from tests and sensors. Further, we had to ensure the integrity of that data for compliance and business reasons. Developing an architecture around this to ensure reliable operation and resiliency was no small feat. It involved integrating multiple products to ensure the systems were monitored and the data was stored locally, backed up, deduplicated and then transferred offsite via the network for a remotely stored copy, all while some of these sites only had a T1 for connectivity. However, it was a feat we were able to accomplish, and we did it all without using the ever popular "edge" marketing moniker.

Fast forward to today and all the rage is edge, edge workloads and edge management. As a marketing tool, the word "edge" has become synonymous with making decisions closer to where a business needs them made. But I was already doing that back in 2008-2010 at BAE Systems.

The story marketing departments and product owners are missing is that, in order for me to do what I did back then, it took a highly technical resource to architect and build out the solution. In today's world, many businesses do not have the luxury of those skilled resources to take the building blocks to build such systems. These businesses, in various industries, are looking for turnkey solutions that will allow them to achieve what I did years ago in a quick and cost efficient manner while leveraging potentially non-technical staff. However, the integration of what I did into a turnkey product that is universally palatable across differing industries and customers seems daunting.

Businesses vary in how they define edge and what they are doing at the edge. Take, for example, connectivity. In some edge use cases like my BAE Systems story or even retail, connectivity is usually fairly consistent and always there. However, for some edge use cases like mining, where vehicles might have the edge systems onboard, the connectivity could be intermittent or dynamic in that the IP address of the device might change during the course of operation. This makes the old push model and telemetry data gathering more difficult because the once known IP address could have changed, and yet the central collector system back in the datacenter has no idea about the device's new IP address. Edge, in this case, requires a different mindset when approaching the problem. Instead of using a push or pull model, a better solution would be leveraging a message broker architecture like the one below.

In the architecture above, I leverage an agent on our edge device that subscribes and publishes to an MQTT broker, and on the server side I do the same. That way, neither side needs to be aware of the other end's network topology, which is ideal when the edge devices might be roaming and changing. This also gives us the ability to scale the MQTT broker via a content delivery network so we can take it global. Not to mention, the use of a message broker also provides the bonus of allowing the business to subscribe to it, enabling further data manipulation and enhancing business logic flexibility.

Besides rethinking the current technological challenges at the edge, we also have to rethink the user experience. The user experience needs to be easy to instantiate and consume. In the architecture above, I provided both a UI and an API. The UI gives the user an initial experience that helps them understand how the product operates, as well as an easy way to do everyday tasks. Again, this is needed because not everyone using the product will have technical abilities, so it has to be easy and consumable. The video below shows a demonstration of how to do an upgrade of a device from the UI. The UI will use the message broker architecture to make the upgrade happen on a device. In the demo, I also show on the bottom left a terminal screen of what is happening on the device as the upgrade is rolling out. I also provide a console view of the device on the lower right so we can view when the device is rebooted.

After watching the demo, it becomes apparent that ease of use and simple requests are a must for our non-technical consumers at the edge. Also, as I mentioned above, I do have an API, so one could write automation against this if the business has those resources available. The bottom line, though, is that it has to be easy and intuitive.

Summarizing what we just covered, let's recognize edge is not a new concept in the computing world. It has existed since the time computers were able to be networked together. Edge in itself is a difficult term to define given the variances of how different industries and the businesses within them consume edge. However, what should be apparent is the need to simplify and streamline how edge solutions are designed given that many edge scenarios involve the use of non-technical staff. If a technology vendor can solve this challenge either on their own or with a few partners, then they will own the market.

Thursday, September 14, 2023

MQTT, Telemetry, The Edge

When we hear the term edge, depending on who we are and what experiences we have had, we tend to think of many different scenarios. However, one of the main themes in all of those scenarios, besides the fact that edge is usually outside of the data center and filled with potential physical and environmental constraints, is the need to capture telemetry data from all of those devices: the need to understand the state of the systems out in the wild and, more importantly, to be able to capture more detail in the event the edge device goes sideways. Now, the sheer number of fleet devices will produce a plethora of data points, and given we might have network constraints, we have to be cognizant of how to deliver all that data back to our central repository for compliance and visibility. This blog will explore the possibilities of MQTT providing a solution to this voluminous problem.

For those not familiar with MQTT, it is a protocol developed back in 1999. The main requirement for the protocol was the transfer of data in networks with low bandwidth and intermittent connections. MQTT was developed primarily for system to system interaction, which makes it ideal for connecting devices in IoT networks for control actions, data exchange or even device performance monitoring. Further, it implements bi-directional message transmission, so a device can receive and send payloads to other devices all without knowing those other devices' network details. Perfect for use cases like planes, trains and automobiles where the IP address might be dynamic and change.

MQTT has three primary "edgy" features:

Lightweight
Easy to implement and operate
Architecture of a publisher-subscriber model

Let's explore each of these features a bit. First, it's lightweight, which means the protocol is able to work on low-power devices ranging from microcontrollers and single board computers to systems on chip (SoC). This is definitely important since some of these devices are small and operate on battery power. The lightweight aspect also imposes minimal requirements and costs on the data moved across the network. This quality is provided by a small service data header and a small amount of actual payload data transmitted. And while the maximum size of the transmitted data in MQTT can be 256 MB, data packets usually only contain a few hundred bytes at a time.
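Both the small header and the 256 MB ceiling come from the same place: the fixed header of an MQTT packet carries a "remaining length" field encoded in one to four bytes, seven bits of value per byte plus a continuation bit. A minimal sketch of that encoding (the function and constant names are mine):

```go
package main

import "errors"

// maxRemainingLength is the largest value four 7-bit groups can hold: 2^28 - 1.
const maxRemainingLength = 268435455

// encodeRemainingLength encodes n as an MQTT variable byte integer (1 to 4 bytes).
func encodeRemainingLength(n int) ([]byte, error) {
	if n < 0 || n > maxRemainingLength {
		return nil, errors.New("remaining length out of range")
	}
	var out []byte
	for {
		b := byte(n % 128)
		n /= 128
		if n > 0 {
			b |= 0x80 // continuation bit: more bytes follow
		}
		out = append(out, b)
		if n == 0 {
			return out, nil
		}
	}
}
```

A packet of up to 127 bytes needs only a single length byte, so the fixed header is two bytes in total. The four-byte maximum of 268,435,455 covers the variable header as well as the payload, which is why the usable payload is slightly under 256 MB.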

The second feature of MQTT is the simplicity of its implementation and operation. Because MQTT is a binary protocol that does not impose restrictions on the format of the data transmitted, the engineer is free to decide what the structure and format of the data should be. It can be any number of formats like plain text, CSV or even the common JSON format. The format is really dependent on the requirements of the solution being built and the medium the data transmission rides across. Along with this openness in how the data is transmitted, the protocol has control packets to establish and manage the connection, along with a delivery acknowledgement mechanism (its QoS levels) built on top of TCP to ensure guaranteed delivery.

Finally, the architecture of MQTT differs from classic client-server configurations in that it implements a publisher-subscriber model where clients can do both but do not communicate directly with other clients and are not aware of each other's existence on the network. The interaction of the clients and the transfer of the data they send is handled by an intermediary called a message broker. The advantages of this model are:

Asynchronous operation ensuring there is no blocking while waiting for messages
Network agnostic in that the clients work with the network without knowing the topology
Horizontal scalability which is important when thinking of 10k to 100k devices
Security protection from scanning because each client is unaware of the other clients' IP/MAC addresses
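To make the decoupling concrete, here is a toy in-memory broker. It is only an illustration of the publisher-subscriber pattern with Go channels, not an MQTT implementation: topics are matched exactly, there is no network, and the Broker type and its methods are hypothetical names.

```go
package main

import "sync"

// Broker is a toy in-memory stand-in for a message broker (exact-match topics only).
type Broker struct {
	mu   sync.Mutex
	subs map[string][]chan []byte
}

// NewBroker returns an empty broker.
func NewBroker() *Broker {
	return &Broker{subs: make(map[string][]chan []byte)}
}

// Subscribe registers interest in topic and returns a buffered channel of payloads.
func (b *Broker) Subscribe(topic string, buffer int) <-chan []byte {
	ch := make(chan []byte, buffer)
	b.mu.Lock()
	defer b.mu.Unlock()
	b.subs[topic] = append(b.subs[topic], ch)
	return ch
}

// Publish fans payload out to every subscriber of topic and returns how many
// received it. A subscriber whose buffer is full is skipped rather than
// blocking the publisher.
func (b *Broker) Publish(topic string, payload []byte) int {
	b.mu.Lock()
	defer b.mu.Unlock()
	delivered := 0
	for _, ch := range b.subs[topic] {
		select {
		case ch <- payload:
			delivered++
		default: // buffer full: drop for this subscriber
		}
	}
	return delivered
}
```

The publisher only ever names a topic and never a subscriber, which is the property the list above relies on. Dropping messages when a buffer is full is a simplification; a real broker's behavior depends on the QoS level in use.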

Overall, the combination of these primary "edgy" features makes MQTT an ideal transport protocol for large numbers of clients needing to send a variety of data in various formats, which makes MQTT attractive in the edge space for device communication.

MQTT could also be perfect for telemetry data at the edge, and to demonstrate the concept we can think about edge from an automobile perspective. Modern cars have hundreds of digital and analog sensors built into them which generate thousands of data points at a high frequency. These data points are in turn broadcast onto a vehicle's Controller Area Network (CAN) data bus, which could be listened to with a logger or MQTT client to record all of the messages being sent. The telemetry data itself can be divided into three general categories:

Vehicle parameters
Environmental parameters
Physical parameters of the driver

The collection of the data points in those categories enables manufacturers and users of the vehicle to achieve goals like monitoring, increased driver safety, increased fuel efficiency, faster time to resolution on service diagnosis and even, in some cases, insight into the state of the driver themselves.

Given the sheer volume of the data and the need to structure it in some way, compounded by the number of cars on the road, MQTT provides a great way to horizontally scale and structure data. The design details will be derived from the requirements of the telemetry needs and from where constraints might exist along the path to obtaining the data points.

Take, for example, how we might structure the data for MQTT from the automobile sensors. In one case we could use MQTT's topic structure and have a state for each item we want to measure and transmit:

schmausautos_telemetry_service/car_VIN/sensor/parameter/state

schmausautos_telemetry_service/5T1BF30K44U067947/engine/rpm/state
schmausautos_telemetry_service/5T1BF30K44U067947/engine/temperature/state
schmausautos_telemetry_service/5T1BF30K44U067947/engine/fuel/state
schmausautos_telemetry_service/5T1BF30K44U067947/engine/oxygen/state

schmausautos_telemetry_service/5T1BF30K44U067947/geo/latitude/state
schmausautos_telemetry_service/5T1BF30K44U067947/geo/longitude/state
schmausautos_telemetry_service/5T1BF30K44U067947/geo/elevation/state
schmausautos_telemetry_service/5T1BF30K44U067947/geo/speed/state
schmausautos_telemetry_service/5T1BF30K44U067947/geo/temperature/state

This option relies on MQTT's ability to create a semantic structure of topics. Each topic is specific to a particular sensor and can be accessed individually without the need to pull additional data. The advantage of this option is that the client can transmit, and the broker can serve, only the indicators of interest. This reduces the amount of transmitted data, which reduces the load on the network. It is an appropriate option where wireless coverage is weak and/or intermittent but parameter control is required, because transmitting a few bytes of parameter data is easier than a full dump of data.
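A subscriber does not have to list every topic individually. MQTT topic filters support two wildcards: + matches exactly one level and # matches all remaining levels. The sketch below builds topics in the layout above and checks them against a filter; the helper names are mine, and the matcher is simplified (it ignores the special handling of topics that start with $).

```go
package main

import "strings"

// stateTopic builds service/VIN/sensor/parameter/state.
func stateTopic(service, vin, sensor, parameter string) string {
	return strings.Join([]string{service, vin, sensor, parameter, "state"}, "/")
}

// matches reports whether topic satisfies an MQTT topic filter.
// "+" matches exactly one level; "#" matches all remaining levels
// and is only valid as the last level of the filter.
func matches(filter, topic string) bool {
	f := strings.Split(filter, "/")
	t := strings.Split(topic, "/")
	for i, level := range f {
		if level == "#" {
			return i == len(f)-1
		}
		if i >= len(t) {
			return false
		}
		if level != "+" && level != t[i] {
			return false
		}
	}
	return len(f) == len(t)
}
```

For example, the filter schmausautos_telemetry_service/+/engine/rpm/state selects the engine RPM of every vehicle in the fleet, while schmausautos_telemetry_service/5T1BF30K44U067947/# selects everything a single vehicle reports.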

A second option for the same type of data might be using the JSON data format and combining all of the sensor data into a single hierarchical message. Thus, when accessing the specific vehicle's topic, all of the vehicle data is passed in a key-value pair format. The advantage of this method is that all parameters are available in a single request. However, because of this and the potential for large messages, it will increase load on the network. Further, it will also require something to serialize and deserialize the JSON string at the client ends of the MQTT interchange. This method is more useful when there is a reliable network connection and coverage.

schmausautos_telemetry_service/car_VIN/state

{
  "engine": {
    "rpm": 5000,
    "temperature": 90,
    "fuel": 80,
    "oxygen": 70
  },
  "geo": {
    "latitude": 45.0101248,
    "longitude": -93.0414592,
    "elevation": 2000,
    "speed": 60,
    "temperature": 65
  },
  ...
}
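The serialize and deserialize step at each end can be small. As a sketch, assuming only the engine and geo fields shown in the example (the type and function names are hypothetical), Go's encoding/json package handles both directions:

```go
package main

import "encoding/json"

// Engine mirrors the "engine" object in the example message.
type Engine struct {
	RPM         int `json:"rpm"`
	Temperature int `json:"temperature"`
	Fuel        int `json:"fuel"`
	Oxygen      int `json:"oxygen"`
}

// Geo mirrors the "geo" object in the example message.
type Geo struct {
	Latitude    float64 `json:"latitude"`
	Longitude   float64 `json:"longitude"`
	Elevation   int     `json:"elevation"`
	Speed       int     `json:"speed"`
	Temperature int     `json:"temperature"`
}

// VehicleState is the single hierarchical message for one vehicle.
type VehicleState struct {
	Engine Engine `json:"engine"`
	Geo    Geo    `json:"geo"`
}

// encodeState produces the payload to publish on the vehicle's state topic.
func encodeState(s VehicleState) ([]byte, error) {
	return json.Marshal(s)
}

// decodeState parses a payload received from the vehicle's state topic.
func decodeState(payload []byte) (VehicleState, error) {
	var s VehicleState
	err := json.Unmarshal(payload, &s)
	return s, err
}
```

Fields missing from a received payload are simply left at their zero value, so a subscriber built this way tolerates vehicles that report only part of the structure.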

Either option, again based on constraints in the requirements, could be valid and useful. But overall they show the flexibility of MQTT and its ability to handle both the sheer scale and the amount of telemetry data coming in from each vehicle's multiple sensors and sources, multiplied by the number of vehicles in the fleet.

Hopefully this blog provided some insight into MQTT and its use for telemetry at the edge. MQTT, while an old protocol, was designed from the beginning for these edge type use cases: use cases that require low power consumption, ease of operation and the flexibility to consume and present data in many formats. And while we explored using MQTT as a method for telemetry data, there are certainly more uses for MQTT in the edge space.

Wednesday, November 09, 2022

Monitoring Sensors and Taking Action

I recently wrote a blog around using Microshift to run my Zigbee2MQTT workload. That blog described all the details on how to deploy Microshift and then deploy the components inside of Microshift to enable some home automation. Of course, with Zigbee2MQTT there is an intuitive web interface to interact with the smart devices. However, I wanted to take another approach that felt more realistic when it comes to edge use cases. I felt that in an industrial scenario there would most likely be some code subscribed to and monitoring the MQTT queue. An action would be performed when a certain event was observed, and the action itself might publish something into the MQTT queue. The rest of this blog will cover a simple scenario like the one I just described.

First we continue to use the same lab environment I used in my previous blog. The only difference here in the diagram below is we have now added a smart power outlet and a temperature/humidity sensor that can both be controlled remotely via the Zigbee protocol like all my other devices.

The Script

With my lab in place, I decided I wanted to write something in Perl. Some might ask why use such an antiquated language as Perl, and part of the answer is that I am old school. For my scenario I envisioned using the humidity sensor to detect when the humidity levels got too high. The threshold would then trigger an action on the event to turn on/off a dehumidifier plugged into the smart outlet. The basic process flow looks like the following diagram:

The script itself can take four different parameters:

--hostname: hostname or IP address of MQTT host (required)
--port: port for MQTT (optional but will default to 1883 if not provided)
--threshold: humidity value that determines when action should be taken
--help: prints the usage of script

The script itself is located here.

When one runs the script without any flags the usage and an example will be displayed.

./mqtt-humidity.pl 
Usage:
      --hostname,-h   Hostname or IP address of MQTT host
      --port,-p       Port for MQTT (defaults to default 1883)
      --threshold,-t  Threshold for humidity (defaults to 60)
      --help,-h       Print this help

    Example:

    mqtt-humidity.pl -ho 10.43.26.170 -p 1883 -t 65

The Demonstration of Script

To demonstrate this script, I went ahead and plugged a light into my smart outlet, which was in the off setting. I launched the script in a terminal window. Then I took the temperature/humidity sensor, cupped it in my hands and blew into my hands. The moisture in my breath is enough to temporarily raise the value. The script provides output so we can see the values changing, and sure enough, when I breathed into my hands with the sensor, the value jumped to 81.37%, which triggered the action event and turned on the light. I then set the sensor back on my desk, and over the course of 5 minutes the value slowly receded. Once it dropped below the threshold value, the light turned back off. The output of my script run is below:

$ perl mqtt-humidity.pl -ho 10.43.26.170 -p 1883 -t 60
Temp C = 23.43 : Temp F = 74.174 : Humidity = 51.22
Temp C = 23.43 : Temp F = 74.174 : Humidity = 81.37 <-- Smart outlet turned on
Temp C = 23.43 : Temp F = 74.174 : Humidity = 84.37
Temp C = 23.43 : Temp F = 74.174 : Humidity = 82.37
Temp C = 23.43 : Temp F = 74.174 : Humidity = 80.28
Temp C = 23.43 : Temp F = 74.174 : Humidity = 78.79
Temp C = 23.63 : Temp F = 74.534 : Humidity = 78.79
Temp C = 23.63 : Temp F = 74.534 : Humidity = 73.21
Temp C = 23.63 : Temp F = 74.534 : Humidity = 74.34
Temp C = 23.63 : Temp F = 74.534 : Humidity = 72.91
Temp C = 23.63 : Temp F = 74.534 : Humidity = 71.65
Temp C = 23.63 : Temp F = 74.534 : Humidity = 70.55
Temp C = 23.63 : Temp F = 74.534 : Humidity = 69.15
Temp C = 23.63 : Temp F = 74.534 : Humidity = 67.93
Temp C = 23.63 : Temp F = 74.534 : Humidity = 66.57
Temp C = 23.63 : Temp F = 74.534 : Humidity = 64.87
Temp C = 23.63 : Temp F = 74.534 : Humidity = 63.28
Temp C = 23.63 : Temp F = 74.534 : Humidity = 62.08
Temp C = 23.63 : Temp F = 74.534 : Humidity = 60.71 
Temp C = 23.63 : Temp F = 74.534 : Humidity = 59.12 <-- Smart outlet turned off
Temp C = 23.63 : Temp F = 74.534 : Humidity = 57.72
Temp C = 23.63 : Temp F = 74.534 : Humidity = 56.7
Temp C = 23.63 : Temp F = 74.534 : Humidity = 55.6
Temp C = 23.63 : Temp F = 74.534 : Humidity = 54.23
Temp C = 23.63 : Temp F = 74.534 : Humidity = 53.21
Temp C = 23.63 : Temp F = 74.534 : Humidity = 52.14
Temp C = 23.63 : Temp F = 74.534 : Humidity = 51.05
Temp C = 23.63 : Temp F = 74.534 : Humidity = 50.04
Temp C = 23.22 : Temp F = 73.796 : Humidity = 50.04
Temp C = 23.22 : Temp F = 73.796 : Humidity = 50.04
^C
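The on/off behavior visible in the run above can be approximated in a few lines. This is a Go sketch of the decision logic only, not a translation of the Perl script: the function names are hypothetical, and the exact comparison the script uses at the threshold itself is an assumption (here nothing changes when the reading equals the threshold).

```go
package main

// celsiusToFahrenheit converts the sensor's Celsius reading for display.
func celsiusToFahrenheit(c float64) float64 {
	return c*9/5 + 32
}

// outletCommand returns the state to request ("ON" or "OFF") when the outlet
// has to change, or "" when it is already in the desired state.
func outletCommand(humidity, threshold float64, outletOn bool) string {
	switch {
	case humidity > threshold && !outletOn:
		return "ON" // humidity climbed above the threshold
	case humidity < threshold && outletOn:
		return "OFF" // humidity dropped back below the threshold
	default:
		return "" // no change needed
	}
}
```

Tracking the current outlet state means a command is only published on a transition, which matches the single "turned on" and "turned off" markers in the output rather than one command per reading.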

Now this was a very simple example, but imagine the possibilities. For example, what if this was a greenhouse that needed to keep the humidity and/or even the temperature within a certain range? If the device that reduces the humidity/temperature (dehumidifier -or- exhaust fan) in the greenhouse could take Zigbee commands directly and control the speed of operation, we might be able to not only turn it on/off but also increase/decrease its speed of operation. All of this ensures that whatever is growing in the greenhouse is not damaged and also ensures we are powering devices only when we need to have them powered. The bottom line is that operating efficiently saves businesses like the greenhouse operational costs.