Activity Classification with TensorFlow | by Benjamin Griffiths

A note to the reader: this article is a high-level narrative of what happened, mixed with a splash of tech. For nerds like me, the code is available in a GitHub repository.

Notion is my preferred tool when it comes to managing projects. It’s easy to customise pages to suit any application and has tonnes of built-in features for project management.

I used it throughout the project lifecycle to keep me on track and organised. Take a look at my project structure and the daily logs: Notion page

Given that I had 6 weeks to work on the project, I decided to break it down into the following stages:

Getting started: Planning and exploring the problem.
Create the data processing software.
Get a basic model working with TensorFlow.
Create the best model possible using research papers.
Deploy the model so that it can be used by others.
Evaluate the work, clean up the code and write this article.

So, we’ve discussed planning and organising the project. Let’s start writing some code.

Like all good machine learning projects, this one starts and ends with data.

Luckily, I already had data collected from some willing friends who were happy for me to play around with it and present it in this article.

The data was collected from a 3-axis accelerometer worn on the shank and an activity monitor worn on the thigh. My plan was to use the thigh-worn device to give me accurate posture measurements throughout the data collection period, and then try to predict those postures from the shank acceleration.

I needed to separate the data into 15-second windows, where each window had 3 axes of acceleration data and a corresponding true posture code. The code could be 1 of 4 postures: sitting, standing, stepping or lying.
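To make the windowing idea concrete, here is a minimal sketch in Python with NumPy. It is not the code from the repository. It assumes 20 samples per second (the rate mentioned later in this article), integer posture codes 0 to 3, and that each window is labelled with its most common posture code. The names make_windows, acceleration and posture_codes are hypothetical.

```python
import numpy as np

SAMPLE_RATE_HZ = 20   # assumption: one sample every 20th of a second
WINDOW_SECONDS = 15
WINDOW_SIZE = SAMPLE_RATE_HZ * WINDOW_SECONDS  # 300 samples per window


def make_windows(acceleration, posture_codes, window_size=WINDOW_SIZE):
    """Split (n_samples, 3) acceleration into non-overlapping windows.

    Assumption: each window takes the most common posture code inside it
    as its label. Posture codes are hypothetical integers 0..3.
    """
    acceleration = np.asarray(acceleration, dtype=float)
    posture_codes = np.asarray(posture_codes, dtype=int)

    n_windows = len(acceleration) // window_size
    usable = n_windows * window_size  # drop the incomplete trailing window

    windows = acceleration[:usable].reshape(n_windows, window_size, 3)
    codes = posture_codes[:usable].reshape(n_windows, window_size)

    # Majority vote per window
    labels = np.array([np.bincount(row).argmax() for row in codes], dtype=int)
    return windows, labels
```

Calling make_windows on an array of shape (n_samples, 3) returns an array of shape (n_windows, 300, 3), which is the kind of input a TensorFlow model can consume directly, plus one label per window.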

To do this, I wanted to create an object-oriented system.

I’m a self-taught programmer. I learned to code so that I could process data for research projects and my passion for it grew from there.

So I’ve had to learn the hard way: the drawbacks of different programming methods, and the common mistakes you get taught to avoid when learning through formal education.

Although I’ve dabbled in OOP before, I’ve never made a system that I thought truly needed OOP. This, however, was a great opportunity. I could picture the different ‘objects’ that would be responsible for the stages of the processing, along with their methods and their properties.

I created an object that represented the device itself, a ‘posture stack’ that would be responsible for the data labels (if they existed), a dataset that would be responsible for separating out the raw data and a model that would be able to make predictions and assess the accuracy.

These objects could easily be extended, for example, to create different length windows or to use different models to make predictions from the data.
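A rough sketch of how such a set of objects might fit together is below. This is an illustration only, not the classes from the repository: every class name, attribute and method here is hypothetical, and the 20 Hz sample rate is taken from the description of the data later in this article.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass
class PostureStack:
    """Holds the true posture labels, one code per sample."""
    codes: Sequence[int]


@dataclass
class Device:
    """One recording from the shank-worn accelerometer."""
    acceleration: Sequence[Sequence[float]]        # rows of (x, y, z)
    sample_rate_hz: int = 20                       # assumption
    posture_stack: Optional[PostureStack] = None   # labels may not exist


class Dataset:
    """Separates a device's raw data into fixed-length windows."""
    window_seconds = 15  # subclasses can override this

    def __init__(self, device):
        self.device = device

    def windows(self):
        size = self.window_seconds * self.device.sample_rate_hz
        data = self.device.acceleration
        # Non-overlapping windows; an incomplete trailing window is dropped
        return [data[i:i + size] for i in range(0, len(data) - size + 1, size)]


class ShortWindowDataset(Dataset):
    """Example extension: same behaviour, shorter windows."""
    window_seconds = 5


class Model:
    """Base class: subclasses supply predict(), accuracy() is shared."""

    def predict(self, windows):
        raise NotImplementedError

    def accuracy(self, windows, true_codes):
        predictions = self.predict(windows)
        correct = sum(p == t for p, t in zip(predictions, true_codes))
        return correct / len(true_codes)
```

The point of the design is that changing the window length or swapping the model means writing a small subclass rather than touching the rest of the pipeline.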

Now all I needed to worry about was loading in the data.

Unfortunately, the device exported its files as CSV. The problem with 5 days of 3-axis acceleration data, sampled every 20th of a second, is that you’re left with some large CSVs: recorded continuously, that is around 8.6 million rows.

I tried a few different ways of importing the data, but my only option was to read it in chunks using pandas. This was slow… very slow.
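For readers who have not met chunked reading before, a minimal sketch looks something like this. It is not the loader from the repository: the function name load_acceleration, the column names x, y and z, and the chunk size are all assumptions made for illustration. The chunksize, usecols and dtype arguments are standard parameters of pandas.read_csv.

```python
import numpy as np
import pandas as pd


def load_acceleration(csv_source, chunk_rows=100_000):
    """Read a large CSV in chunks and return one float32 array of x, y, z.

    csv_source can be a file path or a file-like object.
    The column names below are hypothetical.
    """
    columns = ["x", "y", "z"]
    chunks = []
    reader = pd.read_csv(
        csv_source,
        usecols=columns,      # skip columns we do not need
        dtype=np.float32,     # halve memory compared with float64
        chunksize=chunk_rows, # yields DataFrames of at most chunk_rows rows
    )
    for chunk in reader:
        # Select columns explicitly so the order is always x, y, z
        chunks.append(chunk[columns].to_numpy())
    return np.concatenate(chunks, axis=0)
```

Reading only the needed columns and using a smaller dtype keeps memory down, although parsing millions of rows of text is still slow, which matches the experience described above.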