Coupled data streams need to be analyzed together, paying particular attention to simultaneity in event time and to the other process-specific variables controlling the data streams. With the help of a test problem with known exact solutions, we see that pipeline processing with Beam can accurately reproduce them.
Serving a flask application with gunicorn and nginx on docker… Packaging applications for reproducible results across environments has gotten a great boost with Docker. Docker allows us to bundle the application with all its dependencies so that the resulting image can be run anywhere with a compatible Docker runtime. The…
Serving up Python web applications has never been easier, thanks to the suite of WSGI servers currently at our disposal. Both uWSGI and gunicorn behind Nginx are excellent performers for serving up a Flask app… Yup, what more could you ask for in life, right? There are a number of varieties…
Admit it! All you ever wanted to do in life was to predict. And a decade rolling around doesn’t happen every day, does it? So wear that ridiculous thinking cap if need be and get to work!
The words that are significant to a class can be used to improve the precision-recall trade-off in classification. Using the top significant terms as the vocabulary to drive a classifier yields improved results with a much smaller model for predicting MIMIC-III CCU readmissions from discharge notes.
Querying with high-frequency terms improves recall, while querying with rare terms improves precision. The significant terms balance both while offering some discriminative capacity among the latent classes the retrieved documents may belong to. The MIMIC-III dataset is studied here in the context of predicting patient readmission from the discharge notes, with Elasticsearch driving the significance measures…
Word vectors have evolved over the years to know the difference between “record the play” and “play the record”. They have evolved from a one-hot world where every word was orthogonal to every other word, to a place where word vectors morph to suit the context. Slapping a BoW on word vectors is the usual way to build a document vector for tasks such as classification. But BERT does not need a BoW, as the vector shooting out of the top [CLS] token is already primed for the specific classification objective.
Convolutional layers and their cousins, the pooling layers, are examined for shape modification and parameter counts as functions of layer parameters in Keras/TensorFlow…
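To give a flavor of what that examination involves, here is a minimal sketch of the standard output-length and parameter-count arithmetic for a 2D convolution. The function names are hypothetical, dilation is assumed to be 1, and the padding rules are the usual "valid" and "same" conventions; this is an illustration and not necessarily the derivation used in the post.

```python
import math


def conv_output_length(input_length, kernel_size, stride=1, padding="valid"):
    """Output length along one spatial axis (dilation assumed to be 1)."""
    if padding == "valid":
        # No padding: the kernel must fit entirely inside the input.
        return (input_length - kernel_size) // stride + 1
    if padding == "same":
        # Input is padded so that only the stride shrinks the output.
        return math.ceil(input_length / stride)
    raise ValueError("padding must be 'valid' or 'same'")


def conv2d_param_count(kernel_h, kernel_w, in_channels, filters, use_bias=True):
    """Trainable parameters: one kernel (plus optional bias) per filter."""
    per_filter = kernel_h * kernel_w * in_channels + (1 if use_bias else 0)
    return per_filter * filters


# Example: a 28x28x1 input through a 3x3 convolution with 32 filters,
# followed by 2x2 pooling with stride 2.
conv_side = conv_output_length(28, 3)                    # 26
conv_params = conv2d_param_count(3, 3, 1, 32)            # 320
pool_side = conv_output_length(conv_side, 2, stride=2)   # 13
```

Pooling reuses the same length formula with the pool size in place of the kernel size, and it contributes no trainable parameters.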
Formulae for trainable parameter counts are developed for a few popular layers as functions of layer parameters and input characteristics. The results are then reconciled with what Keras reports upon running the model…
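As a rough illustration of such formulae, the sketch below covers three common layers. The choice of layers (dense, embedding, LSTM) is an assumption, since the excerpt does not name them, and the function names are hypothetical. The LSTM count assumes the standard cell with four gates and a single bias vector per gate.

```python
def dense_param_count(input_dim, units, use_bias=True):
    """Weights (input_dim x units) plus one bias per unit."""
    return input_dim * units + (units if use_bias else 0)


def embedding_param_count(vocab_size, embedding_dim):
    """One embedding_dim-long vector per vocabulary entry."""
    return vocab_size * embedding_dim


def lstm_param_count(input_dim, units):
    """Four gates, each with input weights, recurrent weights and a bias."""
    per_gate = units * (input_dim + units) + units
    return 4 * per_gate


# Example: a 784-feature input feeding a 10-unit dense layer,
# and a 64-unit LSTM reading 100-dimensional vectors.
dense_total = dense_param_count(784, 10)   # 7850
lstm_total = lstm_param_count(100, 64)     # 42240
```

These totals are the numbers one would expect to compare against the "Param #" column that Keras prints in `model.summary()`.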
New data that cracks open the feature space can introduce potentially useful new classes, provided it is detected. Spurts in the rate of increase of new data points with a less than acceptable classification confidence indicate that new data zones are being carved out in the feature space…
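One simple way to watch for such spurts is sketched below. It is only an assumption about how detection might work, not the method of the post: the function name, the confidence threshold, the window size, and the acceptable rate are all hypothetical values picked for illustration.

```python
from collections import deque


def low_confidence_spurts(confidences, threshold=0.6, window=50, max_rate=0.2):
    """Return the indices at which the share of low-confidence predictions
    in the trailing window exceeds max_rate.

    confidences: per-point classification confidence, in arrival order.
    """
    recent = deque(maxlen=window)  # True where a point was low-confidence
    flagged = []
    for i, confidence in enumerate(confidences):
        recent.append(confidence < threshold)
        # Only judge once a full window of points has been seen.
        if len(recent) == window and sum(recent) / window > max_rate:
            flagged.append(i)
    return flagged


# Example: a confident stream interrupted by a burst of uncertain points.
stream = [0.9] * 100 + [0.3] * 30 + [0.9] * 100
spurt_indices = low_confidence_spurts(stream)
```

A sustained run of flagged indices would be the cue to inspect the recent low-confidence points for a candidate new class.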