A Month In Data, Part V

For the “A Month In Data” blog post series, I curate a set of interesting articles, links and resources that I have come across this month relating to data, algorithms and policy: from data science, AI and machine learning, through to ethics, society and governance. As before, alongside the main list — which is presented in no specific order or precedence — I also offer a set of short links to posts, academic papers and other relevant resources.

Part V: December 2017

In this fifth set of posts, we have everything from AI ethics, algorithms and the law, and emerging UK technology strategy, through to developing predictive capabilities, cyberwar, and generating Christmas carols with neural nets:

A Round Up of Robotics and AI ethics: part 1 Principles
This excellent summary post by Alan Winfield is a round up of the various sets of ethical principles of robotics and AI that have been proposed to date, ordered by date of first publication. The principles are presented here (in full or abridged) with notes and references, but without detailed commentary.
Should we be afraid of AI?
Machines seem to be getting smarter and smarter and much better at human jobs, yet true AI is utterly implausible. Why? After so much talking about the risks of ultraintelligent machines, it is time to turn on the light, stop worrying about sci-fi scenarios, and start focusing on AI’s actual challenges, in order to avoid making painful and costly mistakes in the design and use of our smart technologies. (you might also enjoy this other post by Luciano Floridi: Why Information Matters)
2017 Was The Year We Fell Out of Love with Algorithms
Algorithms that amplify fear and help foreign powers put a finger on the scale of democracy? These things sound dangerous! This is a shift from just a few years ago, when “algorithm” primarily signified modernity and intelligence, thanks to the roaring success of tech companies such as Google, an enterprise founded upon an algorithm for ranking web pages. This year, there has been growing concern about the power of technology companies, increasingly regarded as our “algorithmic overlords”. Also see: In 2017, society started taking AI bias seriously.
Chasing trains: The UK talks a good AI game but is it losing pace?
In the scramble to promote sectors of British excellence before Brexit, government and industry have galvanised around artificial intelligence. But sowing a flower bed for AI in the UK is more than a matter of selling startups to internet giants, and there is a concern that the UK is resting on a few lush laurels; the country needs to cultivate its soft power in AI — leading by example in how it interrogates the ethical dilemmas posed by new technology.
A Reality Check: Algorithms in the Courtroom
Can imperfect algorithms help address systemic inequalities in the criminal justice system, as a way to combat the capricious and biased nature of human decisions? This post addresses three critical questions: how well does pre-trial risk assessment work in practice, what do the tools actually measure, and how are the tools related to the life-shaping decisions reformers care most about? (in a related theme, check out how Lawyer-Bots Are Shaking Up Jobs)
Four posts on further imaginative uses for neural nets:
- Christmas Carols, generated by a neural network
  The Times teamed up with reader/neural net hobbyist Erik Svensson to collect a mix of ancient and modern carols — about 240 carols in all — from “What Child is This?” to “Grandma Got Run Over by a Reindeer”. When the neural network begins learning, it starts with a set of random rules about how to put one letter after another to make a Christmas carol, creating junk. After the neural network has spent many rounds refining its rules, it begins to look a lot like Christmas…
- This Machine Learning Algorithm Can Turn Any Line Drawing Into ASCII Art
  ASCII art is created using the set of characters defined in the American Standard Code for Information Interchange; although ASCII art generators have existed for years, they still don’t hold a candle to the intricate ASCII art made by hand. Osamu Akiyama, a Japanese undergraduate medical student at Osaka University and ASCII artist (see his GitHub), has created a neural net that can take any line drawing and render it in ASCII at a quality comparable to hand-made work.
- How to Find Wally with a Neural Network
  Deep learning provides yet another way to solve the Where’s Wally puzzle problem. But unlike traditional image processing computer vision methods, it works using only a handful of labelled examples that include the location of Wally in an image.
- Using Convolutional Neural Networks to detect features in satellite images
  Inspired by Kaggle’s Satellite Imagery Feature Detection challenge, this post explores how easy it is to detect features (roads in this particular case using Dutch open map data) in satellite and aerial images using convolutional neural networks in TensorFlow.
Three posts on a theme of developing (and understanding) predictive capabilities:
- Wisdom of the Crowd Accurately Predicts Supreme Court Decisions
  Crowds can sometimes be wiser than the smartest individuals they contain; now researchers at the Chicago-Kent College of Law in Illinois have carried out the largest study of crowdsourcing (using data from FantasySCOTUS) in predicting SCOTUS decisions (also see the arXiv paper).
- Predicting Stock Performance with Natural Language Deep Learning
  Microsoft and a financial services partner have developed a model (using convolutional neural networks running on the Azure Machine Learning Workbench) to predict the future stock market performance of public companies in categories where they invest. The goal was to use select text narrative sections from publicly available earnings release documents to predict and alert their analysts to investment opportunities and risks.
- YouTube Views Predictor
  A comprehensive guide to getting more views on YouTube backed by machine learning. The authors’ goal was to create a model that can help influencers predict the number of views for their next video; due to the sheer scale of the problem, the scope was narrowed to fitness-related videos, creating a predictor that could be useful for moderately sized YouTube channels.
How to break a CAPTCHA system in 15 minutes with Machine Learning
Everyone hates CAPTCHAs – those annoying images that contain text you have to type in before you can access a website; CAPTCHAs were designed to prevent computers from automatically filling out forms by verifying that you are a real person. But with the rise of deep learning and computer vision, they can now often be defeated easily.
Three posts with a cyber security and national security theme:
- How An Entire Nation Became Russia’s Test Lab for Cyberwar
  A hacker army has systematically undermined practically every sector of Ukraine: media, finance, transportation, military, politics, energy. Wave after wave of intrusions have deleted data, destroyed computers, and in some cases paralysed organisations’ most basic functions. They have been part of a digital blitzkrieg that has pummelled Ukraine for the past three years — a sustained cyberassault unlike any the world has ever seen.
- Machine Learning for Cybercriminals
  In the past year, there has been ample information on the use of machine learning in both defence and attack (especially defence); the objective of this article is to systematise information on possible and real-life uses of machine learning for malicious purposes in cyberspace. It is intended to help members of information security teams prepare for imminent threats.
And finally, four posts with a technical theme:
- A Zero-Math Introduction to Markov Chain Monte Carlo Methods
  For many, Bayesian statistics is voodoo magic at best, or completely subjective nonsense at worst. Among the trademarks of the Bayesian approach, Markov chain Monte Carlo methods are especially mysterious. They’re math-heavy and computationally expensive procedures for sure, but the basic reasoning behind them, like so much else in data science, can be made intuitive.
- Difference Between Classification and Regression in Machine Learning
  Fundamentally, classification is about predicting a label and regression is about predicting a quantity; in this tutorial, you will discover the differences between classification and regression.
- Understanding Dimension Reduction with Principal Component Analysis (PCA)
  The “curse of dimensionality” refers to an exponential increase in the size of data caused by a large number of dimensions. As the number of dimensions of a dataset increases, it becomes more and more difficult to process. Dimension Reduction is a solution to the curse of dimensionality and Principal Component Analysis (PCA) is one of the most popular linear dimension reduction methods (this tutorial is from a seven-part series on Dimension Reduction).
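
To make the last item a little more concrete, here is a minimal sketch of PCA in Python using NumPy. It is not taken from the linked tutorial: the function name `pca`, the toy data and the choice of computing the components via a singular value decomposition are all assumptions made for illustration.

```python
import numpy as np


def pca(data, n_components):
    """Project the rows of `data` onto its top principal components.

    Hypothetical helper for illustration; not the linked tutorial's code.
    """
    # PCA works on mean-centred data.
    centered = data - data.mean(axis=0)
    # The rows of `vt` are the principal directions, ordered by
    # decreasing singular value (and therefore decreasing variance).
    _, singular_values, vt = np.linalg.svd(centered, full_matrices=False)
    components = vt[:n_components]
    # Variance captured along each kept direction.
    variance = singular_values[:n_components] ** 2 / (len(data) - 1)
    # Coordinates of each point in the reduced space.
    projected = centered @ components.T
    return projected, components, variance


# Toy data (an assumption): 200 points in 3-D lying close to a single line.
rng = np.random.default_rng(0)
t = rng.normal(size=200)
points = np.column_stack([t, 2 * t, -t]) + 0.01 * rng.normal(size=(200, 3))

projected, components, variance = pca(points, n_components=1)
print(projected.shape)  # one coordinate per point instead of three
```

Because the toy points sit almost exactly on a line, a single component keeps nearly all of the variance, which is the situation in which dimension reduction loses very little information.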

You might also like…

(check out all of the previous posts in the A Month In Data series)
