Synthetic data is data produced by statistical methods or algorithms as a facsimile of an original dataset or element. The key premise is that the original data is not shared, but its statistical properties are, including anomalies such as outliers.
Synthetic data is mostly seen as useful for preserving privacy or anonymity: it allows dashboards or algorithms to be developed without divulging the dataset to the developers, and it can be used in non-production environments for similar reasons.
Additional use cases outside the privacy realm could be exploring niche permutations of data; synthesizing data to represent an edge condition…
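The core idea above, sharing statistical properties rather than records, can be illustrated with a minimal sketch. It assumes a single numeric column that is roughly normally distributed; the `synthesize` function and the sample values are hypothetical and made up for illustration. A plain Gaussian fit like this one does not attempt to reproduce outliers, which a real synthetic data tool would need to handle separately.

```python
import random
import statistics


def synthesize(original, n, seed=0):
    """Draw n synthetic values from a normal distribution fitted to `original`.

    Only the mean and standard deviation of the original column are used;
    no original record is copied into the output.
    """
    mu = statistics.mean(original)
    sigma = statistics.stdev(original)
    rng = random.Random(seed)  # seeded so the sketch is reproducible
    return [rng.gauss(mu, sigma) for _ in range(n)]


# Hypothetical "private" column; these values are placeholders, not real data.
original = [52.0, 48.5, 50.1, 47.3, 55.8, 49.9, 51.2, 46.7]
synthetic = synthesize(original, n=1000)

# The two columns should have similar summary statistics.
print("original :", statistics.mean(original), statistics.stdev(original))
print("synthetic:", statistics.mean(synthetic), statistics.stdev(synthetic))
```

A developer could build a dashboard against `synthetic` and see realistic ranges without ever receiving `original`.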
I sought to build a more comprehensive MLOps pipeline and solution after previously discussing my experience with the integration between GitHub and Azure ML Workspaces. The only prerequisites for this are a GitHub repo and a Terraform Cloud account. The repository we’ll be working from is here.
Since I want to develop an MLOps environment that is fully automated with infrastructure as code, I’ll be using Terraform and need a Terraform Cloud account (free). A good overview for setting up Terraform Cloud with GitHub Actions can be found here (though it is AWS-centric, so some of it is not relevant).
For my hobby data science projects I’ve come to like Paperspace Gradient. The machines (containers) start up quickly, the machine types are very affordable on an hourly basis, data directories are pre-mounted for you, and you have to configure an auto-shutdown time from the start, which avoids billing surprises. The downside is that, for hobbyists, they don’t really support CI/CD; those features appear to be reserved for Enterprise customers. So I set it up using Azure.
I want a pipeline connected to my GitHub repo that automatically trains my model, logs performance and, if necessary…
I am going to train a toy algorithm and deploy it for inferencing on an Arduino Nano 33 BLE Sense. I am seeking to build and test a shell using the fewest possible components, to be enhanced later.
I’ll be using the AutoMPG Dataset, training a model that will use one feature, Horsepower, to predict the vehicle’s miles per gallon. We will interact with the model using the Arduino Serial Monitor.
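To make the one-feature setup concrete, here is a minimal sketch that fits a straight line from horsepower to miles per gallon using ordinary least squares. This is an assumption for illustration: the notebook may train a different kind of model, and the data points below are placeholders chosen to lie on a line, not values from the Auto MPG dataset. The names `fit_line` and `predict` are hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least squares for one feature: returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(horsepower, slope, intercept):
    """Predict miles per gallon from a single horsepower value."""
    return slope * horsepower + intercept


# Placeholder points (NOT from the Auto MPG dataset), chosen to be exactly linear.
horsepower = [50.0, 100.0, 150.0, 200.0]
mpg = [35.0, 25.0, 15.0, 5.0]

slope, intercept = fit_line(horsepower, mpg)
print(predict(120.0, slope, intercept))
```

A model this small reduces to two numbers, which is convenient when the target is a microcontroller: the device only needs a multiply and an add per prediction.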
The training notebook is available to follow along.
I won’t spend a lot of time on data preparation; it’s a fairly straightforward dataset. The one note is that, since I want to…
An ontology structures and formalizes objects in a domain. It gives us a way to think about how ideas relate to one another, in order to structure and contextualize knowledge.
Ontologies support logical deductions, using reasoners, and linking knowledge from different sources.
Let’s try building an ontology of Starbucks coffees and their characteristics. We will then use this to classify new, unknown coffees.
You can follow along with this notebook.
So with this sketch of a relationship model, let’s code a basic ontology using owlready2, defining a Coffee, its Roast and Region as a starting point.
Owlready allows us to…
So now we have the environment set up and need to publish to the Azure Function. This can either be done from Visual Studio Code with the Azure Functions plug-in, or we can set it up as a GitHub Action. With the GitHub Action, it will publish and update to Azure every time we commit. That’s what I’ve set up, and you can see it in the .github/workflows/main.yml portion of the repo.
To configure this you’ll need to update “AZURE_FUNCTIONAPP_NAME” in the main.yml and follow the steps from this guide, specifically “download your publish profile” and “add the GitHub secret”. With that done…
Adding an 18650 battery with JST jack makes this a portable solution for monitoring environmental conditions.
I wanted to give my kids a more fun and tactile way to control the colors on their Hue lights, so I used the Arduino MKR IoT Carrier with a MKR WiFi 1010 and wrote a program that lets them toggle which light they control, turn it off and on, and randomize the color.
First, we must find the IP address of our Hue Bridge. This can be done using your router. Then connect to the Hue Bridge and register an application. The key here is to get the username. …
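The registration step can be sketched as follows. This assumes the bridge's local v1 REST API, where a POST to `/api` with a `devicetype` body returns the username after the link button on the bridge has been pressed. The helper names, the default app and device names, and the example IP address are all hypothetical, and the sketch is in Python rather than Arduino code to keep it short.

```python
import json
import urllib.request


def build_register_request(bridge_ip, app_name, device_name):
    """Build the POST request that asks the bridge to register an application."""
    url = f"http://{bridge_ip}/api"
    body = json.dumps({"devicetype": f"{app_name}#{device_name}"}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={"Content-Type": "application/json"},
    )


def extract_username(response_text):
    """Pull the username out of the bridge's JSON reply, or raise on an error."""
    for item in json.loads(response_text):
        if "success" in item:
            return item["success"]["username"]
        if "error" in item:
            raise RuntimeError(item["error"].get("description", "unknown error"))
    raise RuntimeError("unexpected response from bridge")


def register(bridge_ip, app_name="hue_carrier", device_name="mkr1010"):
    """Send the request; press the link button on the bridge shortly before calling."""
    request = build_register_request(bridge_ip, app_name, device_name)
    with urllib.request.urlopen(request, timeout=5) as response:
        return extract_username(response.read().decode("utf-8"))


# Usage (requires a real bridge on the local network; the IP is a placeholder):
# username = register("192.0.2.10")
```

The returned username then becomes part of the URL for later calls to the bridge, so it is worth storing rather than registering again on every start.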
Enterprise Architect specializing in data and analytics.