Posts

R Shiny Maths Games for a 6-Year-Old

I have been looking for games to help my 6-year-old practise maths. There are plenty of free apps and online resources, but I am never happy with their game design: often it fails to keep my son engaged, or it has too much gameplay and too little maths. It suddenly occurred to me that it should be very easy to build a game in R Shiny. This allows me to customise the level of difficulty according to his progress and the school curriculum, and to tailor the feedback to keep him engaged. For example, he is very into Pokemon lately, and in my app I can pick different GIFs to respond to a right or wrong answer. My son loves this very simple game. What's more, he is now inspired to write his own game in the future. There are two core components within the server.R script: get_random_question(): randomly select numbers from a range and an operation (+, -, x), then calculate the true answer to the random question. verdict(): check the user's input against the true answer. If correct (wrong), rando
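The two components described above could be sketched roughly as follows. This is a minimal base-R sketch, not the app's actual code: the argument names, the default number range, the rule that keeps subtraction non-negative, and the GIF file names are all assumptions made for illustration.

```r
# Hypothetical sketch of the two server.R helpers described in the post.
# Argument names, defaults and GIF file names are assumptions.

# Draw two numbers and an operation at random, then compute the true answer.
get_random_question <- function(min_num = 1, max_num = 10,
                                operations = c("+", "-", "x")) {
  n_values <- max_num - min_num + 1
  a  <- min_num - 1 + sample.int(n_values, 1)
  b  <- min_num - 1 + sample.int(n_values, 1)
  op <- sample(operations, 1)
  # Assumption: keep subtraction non-negative for a young player.
  if (op == "-" && b > a) {
    tmp <- a; a <- b; b <- tmp
  }
  answer <- switch(op, "+" = a + b, "-" = a - b, "x" = a * b)
  list(text = paste(a, op, b, "= ?"), answer = answer)
}

# Compare the user's input with the true answer and pick a random GIF
# from the matching pool (file names are placeholders).
verdict <- function(user_input, true_answer,
                    right_gifs = c("right_1.gif", "right_2.gif"),
                    wrong_gifs = c("wrong_1.gif", "wrong_2.gif")) {
  correct <- isTRUE(as.numeric(user_input) == true_answer)
  pool <- if (correct) right_gifs else wrong_gifs
  list(correct = correct, gif = sample(pool, 1))
}

set.seed(1)
q <- get_random_question()
cat(q$text, "\n")
verdict(q$answer, q$answer)$correct  # TRUE by construction
```

In a Shiny app, the question text and the chosen GIF would typically be sent to the UI through render functions; that wiring is left out here to keep the sketch self-contained.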

COVID-19: Exponential Growth in London

I predicted on 25th August a second wave in London in 2-4 weeks' time. The recent media coverage and the government's policy reaction look consistent with my prediction. The policy response, the rule of six for social gatherings, is mild. Hopefully, though, it forces people to realise that they cannot afford to relax too much. Nevertheless, from talking to friends and colleagues, I don't get the sense that people are ready to scale back their activities and social events! I have fitted an exponential curve to the recent data and I get a whopping 98% R² and a p-value below 1% (OK, R² is a bit inflated on overlapping data, but still). Here I have added to my graph the fitted values and the model's predicted values for the next four weeks. #covid19 #secondwave The full code is below. It is essentially based on my previous post, with the additional exponential model in the middle section.
library(data.table)
library(ggplot2)
data_url <- "https://c19downloads.azure
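To see the idea in isolation, an exponential fit of this kind can be done as a linear regression on the log of the case counts. The sketch below is an assumption about the approach, not the post's actual code: it uses synthetic data and base R only, and all variable names are made up.

```r
# Synthetic daily case counts with roughly 5% daily growth
# (illustrative only, not real COVID-19 data).
set.seed(123)
day   <- 0:29
cases <- 50 * exp(0.05 * day) * exp(rnorm(length(day), sd = 0.1))

# Exponential growth cases = a * exp(b * day) is linear on the log scale:
# log(cases) = log(a) + b * day
fit <- lm(log(cases) ~ day)

growth_rate   <- unname(coef(fit)["day"])  # estimated b
doubling_time <- log(2) / growth_rate      # days for cases to double
r_squared     <- summary(fit)$r.squared

# Predict the next four weeks and transform back to the original scale.
future    <- data.frame(day = 30:57)
predicted <- exp(predict(fit, newdata = future))

cat(sprintf("growth rate: %.3f per day, doubling time: %.1f days, R2: %.2f\n",
            growth_rate, doubling_time, r_squared))
```

Fitting on the log scale keeps the model a plain lm() call; a nonlinear fit on the raw counts would be an alternative with a different error assumption.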

COVID-19 Update by London Borough

x-axis: the latest 7-day average of new daily lab-confirmed cases per 10,000 people
y-axis: the 30-day increase in new cases, measured as a log ratio
The closer to the bottom left, the better the borough is doing. I then created a distance metric between each borough and Greenwich as a measure of severity:
Severity = Average(Percentile Rank along x-axis, Percentile Rank along y-axis)
which I then set up as the dependent variable. Some suggest there may be clusters related to the Eid festival. However, the coefficient is actually negative and insignificant when I regress against ethnicity data (1). Maybe wealth can be a proxy for overseas mobility and non-white-collar jobs. If I use house prices (link in the code) as a proxy for wealth, the result is again very weak. There is no relationship with the population density of the borough either. Can you spot any pattern?
(1) https://data.london.gov.uk/dataset/detailed-ethnicity-by-age---sex-ward-tools---2011-census--
Full code below:
library(data.table)
library(ggpl
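The severity formula above can be illustrated with a small base-R sketch. Everything here is an assumption for illustration: the boroughs and their values are made up, the column names are hypothetical, and the percentile-rank definition is only one of several possible choices. The comparison with Greenwich mentioned in the post is not reproduced.

```r
# Made-up values for four hypothetical boroughs (not real data).
boroughs <- data.frame(
  borough       = c("A", "B", "C", "D"),
  cases_7d_avg  = c(0.10, 0.25, 0.40, 0.15),  # x-axis: cases per 10,000 people
  cases_30d_ago = c(0.10, 0.10, 0.20, 0.30)   # same measure 30 days earlier
)

# y-axis: 30-day increase measured as a log ratio.
boroughs$log_ratio <- log(boroughs$cases_7d_avg / boroughs$cases_30d_ago)

# Percentile rank: share of boroughs with a value at or below this one
# (an assumed definition).
percentile_rank <- function(v) rank(v, ties.method = "max") / length(v)

# Severity = average of the percentile ranks along the two axes.
boroughs$severity <- (percentile_rank(boroughs$cases_7d_avg) +
                      percentile_rank(boroughs$log_ratio)) / 2

# Highest severity first.
print(boroughs[order(-boroughs$severity), c("borough", "severity")])
```

A score built this way lies between 0 and 1, so it can be used directly as the dependent variable in a regression against borough-level covariates.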

The Infinite Dimension of Truth - According to Taoist Philosophy

  "If the truth can be fully described by language, then this is not the truth"  (1) This is the first sentence of the most important Taoist text and in my opinion often overlooked. In this world, the truth has infinite dimension. And if we try to model the truth by language, we are bound to miss something. Analogously, describing the truth is similar to projecting onto a multi-dimensional plane, e.g. Truth = ∑ß¡x¡ + ε where x¡ can be different words or topics. You will always get an error term ε. And hence: "If you observe without any perspective, then you can see the fullness; if you come with a perspective, then you only see what you come to see"  (2) ... which feels particularly true in today's society.  Search engine and social media platform are even strengthening our innate human behaviour. They deliver news feed and search results that you like to click!  What does that mean to a data scientist and even more generally as a person then? Active open-minded

Is a second wave coming to London?

Some columns in the data file have changed, so I have updated the code from my last post. It is worrying that the daily number of new cases has risen to the same level as when the lockdown started. The speed of the increase remains quite slow, however. Possible reasons include, but are not limited to: the level of immunity in London is higher; people are more careful (but, as I observe, more and more relaxed as the days go by); restricted mobility and working from home; better ventilation, as the weather has been very warm and people here have no air conditioning. It is therefore quite tempting to conclude that the critical threshold for the second wave is going to be higher than the roughly 150 people per day seen when the last lockdown started. I am going to venture a guess: 200-300 people per day, potentially? If the trend continues, bearing in mind that the latest dip is due to lag, then that level could be breached in 2-4 weeks' time? #data.table #R #covid19 #secondwave
Full code below:
library(data.table)
library(ggplot2

Should I send my kids back to school?

To be honest, the main reason for me to send my kids back to school is to keep the household sane. Nevertheless, the decision should be informed by data. So here we go. I have downloaded the data from the UK government website and filtered it by region. The UK lockdown started on 15th March, when there were about 6 new cases per day. In my area, the number of daily confirmed cases has been below 6 per day for some time. I should add the caveat that, to be conservative, we should check all the surrounding areas and wait for 0 cases for at least 2 weeks in all of them to be safe. I have scheduled a job to update the graph every day to keep an eye on the number. For now, and for my family, the risk is definitely worth taking. #data.table #R #backtoschool #covid19 #parenthood r-bloggers.com
library(data.table)
library(ggplot2)
data_url <- "https://c19downloads.azureedge.net/downloads/csv/coronavirus-cases_latest.csv"
raw_data <- fread(data_url, check.names = TRUE)
merton_