Raspberry Pi + Azure Functions + Cognitive Services + Microsoft Teams -> How a Pi informs us about lunch

Introducing MoCaDeSyMo

Like every other office, ours has several options for lunch. You can take your break and visit the cafeteria, you can order food from a local restaurant, or you can just step outside and buy something from the food truck. I'm a regular food truck customer, but this option has a downside: my office is on the far side of the building from the front door, so I don't see the truck arriving. As a result, I regularly "ping" colleagues who have a direct line of sight via Skype for Business to find out when I should go for lunch. Only a few people on our team have a window facing the front door, so they get pinged a lot. To end their misery we came up with the idea of some kind of push notification.

The first idea was a mobile app that the driver of the food truck would install on his phone. The app would recognise our GPS coordinates and push a simple message to Azure upon arrival at our front door. This could have worked, but it depended on the driver's phone and his willingness to join in. After some further thinking, we settled on a Raspberry Pi that takes pictures of the scene, uploads them, and uses Cognitive Services to check whether the food truck is present or not.

So without further ado, here is MoCaDeSyMo. MoCaDeSyMo (mobile carbohydrate delivery system monitor) is the red fellow on the left of the picture. Inside his red body is a Raspberry Pi 3 and the default Pi camera module. On weekdays between 10:45 and 11:30, MoCaDeSyMo takes a picture every minute and tries to detect the food truck of a local snack bar called "Imbiss Rainer". His main target is to recognise something like this:

To make sure everyone on the team knows about the arrival of the truck, MoCaDeSyMo uses Azure Cognitive Services to predict the presence of the food truck. If the probability is above a given threshold, a Microsoft Teams connector gets triggered and we get notified with a message like this:

This post discusses how we created the system based on this high-level architecture:

The Raspberry part

To make this possible we use a simple Raspberry Pi 3 with a camera module and a default Raspbian Jessie image as its operating system. Raspbian has the best tooling for the Pi camera, and since taking pictures is the main task, we decided to stick with this operating system instead of installing Windows 10 IoT Core. This also means we are using Linux as our base system to talk to Azure, a new experience for us, but one I can highly recommend given that the whole shell script has only 35 lines of code.

```

#!/bin/bash

PATH=$PATH:/home/pi/bin
export AZURE_STORAGE_ACCOUNT='%YOUR_STORAGE_ACCOUNT%'
export AZURE_STORAGE_ACCESS_KEY='%YOUR_ACCESS_KEY%'

container_name='%YOUR_CONTAINER%'

today=`date '+%Y-%m-%d-%H-%M-%S'`
filename="$today.png"

echo "taking the picture..."
raspistill -o $filename

echo "cropping the picture..."
mogrify -crop 1487x619+907+904 $filename

echo "logging into azure..."
az login --service-principal -u %USER_GUID% --password "%YOUR_STRONG_PWD%" --tenant %TENANT_GUID%

echo "uploading image"
az storage blob upload --container-name $container_name --file $filename --name $filename

echo "deleting the image"
rm $filename

echo "logging out from azure"
az logout

echo "triggering azure function..."
image_param='image=https://%YOUR_STORAGE_ACCOUNT%.blob.core.windows.net/%YOUR_CONTAINER%/'
function_url='%YOUR_FUNCTION_URL%'

curl -G $function_url -d $image_param$filename

echo ""
```

That's all the code running on the Pi. It takes a picture and crops it so we focus only on the part of the scene we are interested in. For taking the picture we use raspistill, a tool that has a lot to offer. To collect training pictures for the Cognitive Services API we needed just one command: `raspistill -t 30000 -tl 2000 -o image%04d.jpg`


Azure Function
--------------

After uploading the picture, we trigger an Azure Function with the URL of the new item in blob storage. To get prediction data for our image, the function calls a Custom Vision endpoint:

```
client.DefaultRequestHeaders.Add("Accept", "application/json");
client.DefaultRequestHeaders.Add("Prediction-Key", "%YOUR\_PREDICTION\_KEY%");
var response = await client.PostAsync(String.Format("%YOUR\_CUSTOM\_VISION\_ENDPOINT%"), requestData);
var result = await response.Content.ReadAsStringAsync();
```

This service is available at [https://www.customvision.ai](https://www.customvision.ai) and has been in preview for a few weeks. Within this service, you can create a project that provides image recognition predictions. "Easily customize your own state-of-the-art computer vision models that fit perfectly with your unique use case. Just bring a few examples of labeled images and let Custom Vision do the hard work." That is the marketing slogan, and for our example it delivered exactly that. We used 77 pictures with the truck present and 164 images without it to train the system via the web interface. With just one iteration of training we get results like this one:

![](http://www.modernworkplacesolutions.rocks/wp-content/uploads/2017/06/test1.png)

Above we use the URL of the image in blob storage in the Quick Test feature of the Custom Vision service. It gives us 100% "false" because I tagged all the images without the food truck as "false". Below I uploaded a picture from my hard drive with the truck present and we get a 100% probability of "true". We also get 1.1% for "false", which I wanted to investigate in detail, because that adds up to 101.1% and sounded wrong to me. But maybe it's a feature I'm not aware of. (UPDATE: Of course I was wrong. The probabilities are not exclusive; the image recognition service returns the probability of categorising the image with each tag. It doesn't know the tags are mutually exclusive, hence it returns a probability for each tag independently.)

![](http://www.modernworkplacesolutions.rocks/wp-content/uploads/2017/06/test2.png)

The endpoint we called in our Custom Vision project returned a JSON object:

```
{\\"Id\\":\\"%ID%\\",\\"Project\\":\\"%PROJECT\_ID%\\",\\"Iteration\\":\\"%ITERATION\_ID%\\",\\"Created\\":\\"2017-06-26T16:07:48.1746322Z\\",\\"Predictions\\":\[{\\"TagId\\":\\"7c24f243-ed2c-4f20-b35f-88b7b96b1711\\",\\"Tag\\":\\"False\\",\\"Probability\\":1.0},{\\"TagId\\":\\"9721bc40-c00c-460d-97e4-67a269ce543a\\",\\"Tag\\":\\"True\\",\\"Probability\\":3.00917918E-07}\]}
```

To parse the response we use some class definitions that can be found in a sample project ([http://aihelpwebsite.com/Blog/EntryId/1025/Microsoft-Cognitive-Custom-Vision-Service-ndash-A-Net-Core-Angular-4-Application-Part-One](http://aihelpwebsite.com/Blog/EntryId/1025/Microsoft-Cognitive-Custom-Vision-Service-ndash-A-Net-Core-Angular-4-Application-Part-One)):

```
public class CustomVisionResponse
{
    public string Id { get; set; }
    public string Project { get; set; }
    public string Iteration { get; set; }
    public string Created { get; set; }
    public List<Prediction> Predictions { get; set; }
}

public class Prediction
{
    public string TagId { get; set; }
    public string Tag { get; set; }
    public string Probability { get; set; }
}
```

With these classes, we only need to take the TagIds from the response above and query the response for the probabilities of the tags we are interested in. The final step is to call our Microsoft Teams connector to inform us about the probabilities for the picture that was taken:

```
using (var teams_client = new HttpClient())
{
   var json_ = "{\"text\":\"" + output + link_to_img + "\"}";
   log.Info(json_);
   var response_ = await teams_client.PostAsync("%YOUR_TEAMS_CONNECTOR_URL%", new StringContent(json_, System.Text.Encoding.UTF8, "application/json"));
   var result_ = await response_.Content.ReadAsStringAsync();
}
```
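The snippets above are only the interesting fragments of the function. The glue between them, deserialising the JSON response, picking the probability of the "True" tag and comparing it against the threshold before the Teams call, isn't shown in the post. Here is a minimal sketch of how that glue could look, assuming Newtonsoft.Json is used for parsing; the threshold value and the message text are illustrative and not taken from the original code:

```
// A condensed, hypothetical view of the glue inside the function.
// Requires: using System.Linq; using System.Globalization; using Newtonsoft.Json;
var prediction = JsonConvert.DeserializeObject<CustomVisionResponse>(result);

// Probability that the picture was tagged "True", i.e. the food truck is present.
var truckProbability = prediction.Predictions
    .Where(p => p.Tag == "True")
    .Select(p => double.Parse(p.Probability, CultureInfo.InvariantCulture))
    .FirstOrDefault();

const double threshold = 0.9; // assumed value, not from the original post

if (truckProbability >= threshold)
{
    // output and link_to_img feed the Teams snippet above; the image URL is the
    // query parameter the Pi appended when it called the function.
    var output = $"Imbiss Rainer has arrived (probability {truckProbability:P1}) ";
    // ... build link_to_img from the query parameter and post to the Teams connector ...
}
```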

Conclusion

Within only a few hours we had a real prototype that uses our Raspberry Pi to take pictures and upload them to Azure. In an Azure Function we call the Custom Vision API to retrieve the probabilities from our custom image recognition model, and this information is then pushed to a Microsoft Teams channel. All that in roughly 8-9 hours of work spread over three days. Of course, the code isn't production ready and the use case is very simplistic, maybe nothing our customers can relate to immediately. But it certainly shows what is possible with the current technology, some imagination and a bit of creativity.