Last weekend was pretty interesting down here in Sheffy! My brother and I took part in the local hackathon: HackSheffield (fb event).
I won’t explain what a hackathon is (the previous link to Wikipedia opens in a new tab), so I’ll just describe my point of view. The committee for HackSheffield was truly amazing. The sponsors, Major League Hacking and The University of Sheffield were very supportive! The organisation was spot on and everything was scheduled with great attention. But let’s go in order.
The event was a 24+ hour gathering of enthusiastic developers and curious non-techies. In our case I was the dev and my brother joined me as an interested non-techie.
The sponsors gave brief presentations to kick things off. A team from SkyBet was there, as well as lead-management company DataBowl. Chersoft, Ossila, Wandisco and Huel completed the sponsor lineup.
The hack started at midday on Saturday. Most of the teams were already formed, but we found Chris Ingram to team up with. Our team focused on one of the datasets provided by the organisation, specifically by Sheffield Uni. It consisted of 64k entries representing human behaviour recorded by sensors. Many of these entries were plainly wrong, and some were recorded as “unknown”, which adds little to no information to the data. Our task was to come up with a solution to reduce the noise in the dataset and to visualise the data. As my brother does not have much experience in code development, Chris and I took care of that. Chris developed the visualisation side using Node.js, React and gulp. I used Python, with NumPy and Matplotlib, to operate on the data.
We set up a GitHub repo (sadly the data is covered by non-disclosure agreements and could not be shared) and a Devpost project.
The Pasta brothers did some research and came up with a combination of rule-based and classification techniques to reduce the noise. We used some simple rules, e.g. collapse an unknown entry if it sits between coherent entries. After a couple of hours researching and writing test code for several machine learning techniques (including k-nearest neighbours and Principal Component Analysis) we decided to use a neural network. We used the features of the dataset as inputs and normalised them (more work needed there) into a numerically stable range. There are a few classes we were trying to capture (e.g. WALKING, IN_VEHICLE, etc.), so we decided to use 6 output neurons (more testing needed here). By splitting the dataset into thirds we could use the first two thirds for the training phase and the remaining third to test the quality of the classification.
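As a rough illustration of the preprocessing steps described above (not the code written at the hackathon), here is a minimal sketch in Python with NumPy. The function names, the `UNKNOWN` label string and the toy values are all assumptions made for the example. It also assumes that "collapsing" means relabelling a run of unknown entries with the label of its neighbours when both neighbours agree, and that min-max scaling to [0, 1] is an acceptable stand-in for the normalisation.

```python
import numpy as np

UNKNOWN = "UNKNOWN"  # hypothetical spelling of the "unknown" label


def collapse_unknowns(labels):
    """Relabel runs of UNKNOWN that sit between two identical known labels."""
    out = list(labels)
    n = len(out)
    i = 0
    while i < n:
        if out[i] != UNKNOWN:
            i += 1
            continue
        start = i
        while i < n and out[i] == UNKNOWN:
            i += 1
        # The run covers indices [start, i); fill it only if both
        # neighbours exist and carry the same label.
        if start > 0 and i < n and out[start - 1] == out[i]:
            for j in range(start, i):
                out[j] = out[start - 1]
    return out


def min_max_normalise(features):
    """Scale each feature column to [0, 1]; constant columns become 0."""
    features = np.asarray(features, dtype=float)
    lo = features.min(axis=0)
    span = features.max(axis=0) - lo
    span = np.where(span == 0, 1.0, span)  # avoid division by zero
    return (features - lo) / span


def split_two_thirds(features, labels):
    """First two thirds for training, remaining third for testing."""
    cut = (2 * len(features)) // 3
    return features[:cut], labels[:cut], features[cut:], labels[cut:]


# Toy data, invented for the example.
labels = collapse_unknowns(
    ["WALKING", "UNKNOWN", "WALKING", "UNKNOWN", "IN_VEHICLE", "IN_VEHICLE"]
)
features = min_max_normalise(
    [[0, 10], [5, 10], [10, 10], [2, 10], [4, 10], [8, 10]]
)
x_train, y_train, x_test, y_test = split_two_thirds(features, labels)
```

The split keeps the original order, as in the description above. For a time-ordered sensor log this means the test third comes from a later period than the training data.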
We did not have enough time to carry out exhaustive testing, as marking the correctness of the predictions would have taken too long. We could see from the graphs produced that there was a certain degree of learning, but many instances were misclassified. In my opinion, this is due to the large skew in the class distribution of the data. One class (WALKING) was clearly overrepresented, with more than double the instances of the second most common class.
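To make the skew concrete, the sketch below counts how often each class appears and derives inverse-frequency class weights, one common way of compensating for an overrepresented class during training. This is not something we tried at the hackathon. The function names and the toy label list are assumptions made for the example, and the real class counts are not reproduced here.

```python
from collections import Counter


def class_distribution(labels):
    """Return the fraction of entries belonging to each class."""
    counts = Counter(labels)
    total = sum(counts.values())
    return {label: count / total for label, count in counts.items()}


def inverse_frequency_weights(labels):
    """Weight each class by total / (number_of_classes * class_count).

    Rare classes get weights above 1 and common classes below 1.
    """
    counts = Counter(labels)
    total = sum(counts.values())
    return {
        label: total / (len(counts) * count)
        for label, count in counts.items()
    }


# Toy labels, invented for the example: WALKING dominates.
toy_labels = ["WALKING"] * 6 + ["IN_VEHICLE"] * 2

distribution = class_distribution(toy_labels)
weights = inverse_frequency_weights(toy_labels)
```

With these toy labels WALKING makes up three quarters of the entries, so a classifier that always answers WALKING would already look 75% accurate. That is why raw accuracy can be misleading on skewed data.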
This technical approach won us one of the sponsors’ prizes. We each got a DBPOWER Hawkeye III Drone. I cannot say how chuffed we were when we received the award and the prize! Really happy times 🙂
The other hacks were truly amazing! The team that won all the remaining awards developed a visualisation tool that shows events near the user’s location on a map, pulling them directly from Facebook! Another team used governmental data (huge files describing London 1 square meter at a time) to plot a map of potential solar panel areas. Some people hacked by combining Oculus Rift and Leap Motion. There was some really cool stuff on controllers too, such as using Myo (an armband which reads the electrical activity of the forearm muscles) to control a remote robot. The list of hacks goes on: an incredible effort by all the teams and hackers, some of whom had not gone to bed the night before and were still up on stage presenting their work.
Apart from the prize or the hack itself, what this experience really gave me was a sense of inclusivity, the feeling of being part of a wider community of people. The participants had different backgrounds and areas of knowledge, but all were striving to create, and to do so in a stimulating and creative environment.