I am a CS and statistics major working on my graduation project. In 2020, it would be almost absurd for such a project not to relate to machine learning somehow. So I ran an experiment with the DenseNet architecture, trying different levels of dense connections to see how they influence network performance. (For more about DenseNet, read the paper and this article.)
However, my school also requires me to teach my project to classmates from different majors in a 60-minute session. I can't assume they know what a neural network is, and there is no time for a crash course in machine learning before explaining my project. What better way to explain a technical project to a non-technical audience than visualization? This reminded me of the amazing TensorFlow Playground I played with when I was first learning machine learning. So I started by building a similar, though much simpler, playground application.
See playground for the code specific to this application.
The idea of this project is simple: let users add and remove dense connections in the network by clicking the connection edges, and show an instant update of the prediction result.
To simplify the project, all of the models are trained locally using code similar to my dense connections experiment, and all of the prediction results are saved into a JSON file. Besides being much smaller, the main change to the model is that the final layer uses a Softmax activation, so that each prediction is between 0% and 100% and the predictions sum nicely to one, compared to the more common use of ReLU followed by taking the maximum output as the top predicted label. The web application therefore only needs to be a frontend application that handles click input and updates the page render.
The main challenge of this project for me was learning and using D3.js quickly. Examining the code for TensorFlow Playground, I saw that its network configuration component is built on D3. I was a bit hesitant to start learning D3 for a side project with a tight deadline, but after searching through some alternative network visualization libraries, it became clear that the higher-level libraries do not give sufficient control, which made D3 the most sensible option.
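The Softmax output and the "save predictions to JSON" step described above can be sketched roughly as follows. This is only an illustration, not the project's actual code: the raw scores, the three-class setup, and the record layout (a key of `0`/`1` characters mapped to a list of probabilities) are all assumptions made for the example.

```python
import json
import math


def softmax(logits):
    """Turn raw scores into probabilities that are non-negative and sum to 1."""
    shift = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - shift) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical raw outputs of the final layer for one input and three classes.
raw_scores = [2.0, 1.0, 0.1]
probs = softmax(raw_scores)

# Hypothetical record layout: the key encodes which dense connections are on.
record = {"1010110001": [round(p, 4) for p in probs]}
serialized = json.dumps(record)
```

Because every value lies between 0 and 1 and the values sum to one, a frontend could display them directly as percentages without any further normalization.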
Another interesting thing I completely missed in the first iteration of this project is how large the number of possible model configurations is. I originally went for a network with three dense blocks and six dense layers in total, as in the illustration used in CondenseNet (see figure 2 in the repository). This model has 21 dense connections (the curved edges in the plot), which gives a total of 2^21 = 2,097,152 possible configurations when each dense connection can be switched on or off. Given that I planned to pre-train the models locally, this was practically impossible to carry out. So I reduced the network to two dense blocks with four layers in total, as in the application. This gives 10 dense connections and 2^10 = 1,024 possible configurations, and training all 1,024 models takes a few hours on a GPU.
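To make the counting concrete, here is a minimal sketch of how the on/off configurations could be enumerated, and how a single click might map to a new configuration. The string encoding of a configuration and the helper names `all_configurations` and `toggle` are hypothetical, chosen only for this example.

```python
from itertools import product

NUM_CONNECTIONS = 10  # dense connections in the reduced network


def all_configurations(n):
    """Yield every on/off assignment as a string of '0' and '1'."""
    for bits in product("01", repeat=n):
        yield "".join(bits)


def toggle(config, index):
    """Return a new configuration with the connection at `index` flipped."""
    flipped = "1" if config[index] == "0" else "0"
    return config[:index] + flipped + config[index + 1:]


configs = list(all_configurations(NUM_CONNECTIONS))
print(len(configs))  # 1024
print(2 ** 21)       # 2097152
```

With an encoding like this, each configuration string could double as the lookup key for its pre-computed prediction, so clicking an edge would amount to one `toggle` call followed by one dictionary lookup.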