Reproducibility-as-a-service: can the cloud make it real?

Kenji Takeda, Solutions Architect and Technical Manager with Microsoft Research, has written a blog post on Recomputability 2014 that also discusses some of the issues (and potential opportunities) for reproducibility in computational science outlined in our joint paper (including a quote from me):

This is an exciting area of research and one that could have a profound impact on the way that computational science is performed. By rethinking how we develop, use, benchmark, and share algorithms, software, and models, alongside the development of integrated and automated e-infrastructure to support recomputability and reproducibility, we will be able to improve the efficiency of scientific exploration as well as promoting open and verifiable scientific research.

Read Kenji’s full post on the Microsoft Research Connections Blog.


It’s impossible to conduct research without software

No one knows how much software is used in research. Look around any lab and you’ll see software — both standard and bespoke — being used by all disciplines and seniorities of researchers. Software is clearly fundamental to research, but we can’t prove this without evidence. And this lack of evidence is the reason why we ran a survey of researchers at 15 Russell Group universities to find out about their software use and background.

The Software Sustainability Institute’s recent survey of researchers at research-intensive UK universities is out. Headline figures:

  • 92% of academics use research software;
  • 69% say that their research would not be practical without it;
  • 56% develop their own software (worryingly, 21% have no training in software development);
  • 70% of male researchers develop their own software, and only 30% of female researchers do.

For the full story, see the SSI blog post; the survey results described there are based on the responses of 417 researchers selected at random from 15 Russell Group universities, with good representation from across the disciplines, genders and career grades. It represents a statistically significant number of responses that can be used to represent, at the very least, the views of people in research-intensive universities in the UK (the data collected from the survey is available for download and is licensed under a Creative Commons by Attribution licence).

(you may also like to sign this petition and join the UK Community of Research Software Engineers)


Accepted papers and programme for Recomputability 2014

I am co-chairing Recomputability 2014 next week, an affiliated workshop of the 7th IEEE/ACM International Conference on Utility and Cloud Computing (UCC 2014). The final workshop programme is now available and it will take place on Thursday 11 December in the Hobart Room at the Hilton London Paddington hotel.

I will also be presenting our paper on sharing and publishing scientific models (arXiv), as well as chairing a panel session on the next steps for recomputability and reproducibility; I look forward to sharing some of the outcomes of this workshop over the next few weeks.

The workshop Twitter hashtag is #recomp14; you can also follow the workshop co-chairs: @DrTomCrick and @npch, as well as the main UCC account: @UCC2014_London.


Toys for girls and boys

This week an image has been doing the rounds on Twitter, showing a letter to parents printed on the back of a pamphlet from a LEGO set:


Originally posted on reddit, it unsurprisingly went viral, although many questioned its authenticity. However, it has been confirmed as genuine by LEGO UK:

The text is from 1974 and was a part of a pamphlet showing a variety of Lego doll house products targeted girls aged 4 and up. It remains relevant to this day — our focus has always been, and remains to bring creative play experiences to all children in the world…ultimately enabling children to build and create whatever they can imagine.

Don’t forget, you can use this helpful guide to check to see if a toy is for boys or girls.


Some very Pointless maths

If you enjoy mathematics as well as the BBC quiz series Pointless, hosted by Alexander Armstrong and Richard Osman, then you will most definitely enjoy the following blog post by Mathistopheles:

This article is based on a gloriously irrelevant mathematical sequence that is derived (rather appropriately) from the episodes of the television show Pointless. It is the sort of idea that has me scribbling calculations on the back of envelopes for hours on end, despite there being absolutely no hope of an outcome that could in any way justify this investment of effort. In this first part, I introduce the sequence and explain how it is related to some well-known mathematical objects called Markov chains. In the vain hope that I might convey my enthusiasm for this topic to others, I have tried to write this piece in a fairly accessible way. Almost no mathematical knowledge is assumed, beyond a rough idea of what probability is.

It provides a clear and accessible analysis of how the quiz show works by using directed graphs, matrices and Markov chains: read the full post here.


Computing research

There is nothing to do with computers that merits a PhD.

Max Newman (1897-1984), as quoted in Alan Turing: The Enigma by Andrew Hodges



Roots of integers

An integer is either a perfect square or its square root is irrational. Essentially: when you compute the square root of an integer, there are either no figures to the right of the decimal point or there are infinitely many figures to the right of the decimal point and they don’t repeat. There’s no middle ground: you can’t hope, for example, that the decimal expansion might stop or repeat after a hundred or so terms.

The proof of this theorem is surprisingly simple, not much harder than the familiar proof that the square root of 2 is irrational.

Suppose \tfrac{a}{b} is a fraction in lowest terms, i.e. a and b are co-prime (their gcd is 1), and \tfrac{a}{b} is a solution to x^n = c, where n > 0 is an integer and c is an integer. Then:

(\dfrac{a}{b})^n = \dfrac{a^n}{b^n} = c

and so:

\dfrac{a^n}{b} = c b^{n-1}

Now the right side of the equation above is an integer, so the left side must be an integer as well. But b is relatively prime to a, and so b is relatively prime to a^n. The only way \tfrac{a^n}{b} could be an integer is for b to equal 1 or -1. And so \tfrac{a}{b} must be an integer.

Another way to get the same result is to assume \tfrac{a}{b} is an irreducible fraction and is not an integer (i.e. b \neq \pm 1), and consider (\tfrac{a}{b})^n. Clearly a^n and b^n are co-prime and the denominator b^n \neq \pm 1, so \tfrac{a^n}{b^n} is not an integer.

So what we said about square roots extends to cube roots and in fact to all integer roots (for example, the fifth root of an integer is either an integer or an irrational number). In other words: no (non-integer) fraction, when raised to a power, can produce an integer.
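
If you would rather poke at this numerically, here is a minimal Python sketch (not from the original post; the function names `integer_root` and `describe_root` are made up for illustration). It assumes c is non-negative and uses exact integer arithmetic only, so no floating-point rounding is involved. By the theorem above, whenever no integer root exists the real n-th root must be irrational.

```python
from typing import Optional


def integer_root(c: int, n: int) -> Optional[int]:
    """Return the non-negative integer r with r**n == c, or None if none exists."""
    if c < 0 or n < 1:
        raise ValueError("this sketch assumes c >= 0 and n >= 1")
    lo, hi = 0, max(1, c)
    # Binary search over candidate roots, using exact integer powers.
    while lo <= hi:
        mid = (lo + hi) // 2
        power = mid ** n
        if power == c:
            return mid
        if power < c:
            lo = mid + 1
        else:
            hi = mid - 1
    return None


def describe_root(c: int, n: int) -> str:
    """Classify the n-th root of c as an integer or (by the theorem) irrational."""
    r = integer_root(c, n)
    return "integer ({})".format(r) if r is not None else "irrational"


print(describe_root(49, 2))  # integer (7)
print(describe_root(2, 2))   # irrational
print(describe_root(32, 5))  # integer (2)
```

Note that the code never tests for irrationality directly; it only rules out an integer root, and the theorem does the rest.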

(reblogged from John D. Cook’s blog)


A rational animal

Man is a rational animal — so at least I have been told. Throughout a long life, I have looked diligently for evidence in favour of this statement, but so far I have not had the good fortune to come across it, though I have searched in many countries spread over three continents.

Unpopular Essays (1950)
Bertrand Russell (1872-1970)



Come and do a funded PhD with me

Fancy doing a PhD with me at Cardiff Metropolitan University? I have a fully-funded studentship (for UK/EU students) starting in January, in collaboration with HP in Bristol:

The Department of Computing & Information Systems, Cardiff Metropolitan University, is pleased to offer a fully funded PhD Studentship in Provably Optimal Code Generation.

This research project (Scaling Superoptimisation for Enterprise Applications) is part of an on-going strategic collaboration between Cardiff Metropolitan University and Hewlett-Packard in Bristol; HP is a leading technology company that operates in more than 170 countries around the world, providing infrastructure and business offerings that span from handheld devices to some of the world’s most powerful supercomputers.

Applicants must have an excellent first degree in Computer Science, Computer Engineering, Mathematics or a related discipline, with interests/experience at the hardware/software interface and/or in mathematical foundations.

This three-year PhD will commence in January 2015. The PhD bursary consists of the standard tuition fee for a Home/EU student (to be £3,760 in 2014/15) and a stipend linked to the minimum amount set annually by Research Councils UK (currently £13,590 p.a.).

Project Context:

Our world is increasingly dependent on the effectiveness and performance of software. Tools and methodologies for creating useful software artefacts have been around for many years, but the scalability of these systems for solving challenging real world problems is, in many important cases, poor. While there are numerous socio-technical issues associated with developing large software systems, there is a significant opportunity to address the optimisation of software in a strategic, adaptable and platform-independent way.

Superoptimisation is an approach to optimising code by aiming for optimality from the outset, rather than as the aggregation of heuristics that are neither intended nor guaranteed to give provable optimality. Building on previous work by Crick et al., this research project will further develop the theoretical foundations of superoptimisation, as well as developing a scalable toolchain for superoptimising enterprise-level software applications.

For informal enquiries, please send me an email (but please apply via FindAPhD or here).

Deadline for applications: Friday 31 October.


Real world problem solving


Somehow, I don’t think solving this problem will do much for the couple…

(HT Thanks, Textbooks)


Paper submitted to Recomputability 2014: “Share and Enjoy”: Publishing Useful and Usable Scientific Models

Last month, Ben Hall, Samin Ishtiaq, Kenji Takeda (all Microsoft Research) and I submitted a paper to Recomputability 2014, to be held in conjunction with the 7th IEEE/ACM International Conference on Utility and Cloud Computing (UCC 2014) in London in December. This workshop is an interdisciplinary forum for academic and industrial researchers, practitioners and developers to discuss challenges, ideas, policy and practical experience in reproducibility, recomputation, reusability and reliability across utility and cloud computing. It aims to provide an opportunity to share and showcase best practice, as well as offering a platform to further develop policy, initiatives and practical techniques for researchers in this domain.

In our paper, we discuss a number of issues in this space, proposing a new open platform for the sharing and reuse of scientific models and benchmarks. You can download our arXiv pre-print; the abstract is as follows:

The reproduction and replication of reported scientific results is a hot topic within the academic community. The retraction of numerous studies from a wide range of disciplines, from climate science to bioscience, has drawn the focus of many commentators, but there exists a wider socio-cultural problem that pervades the scientific community. Sharing data and models often requires extra effort, and this is currently seen as a significant overhead that may not be worth the time investment.

Automated systems, which allow easy reproduction of results, offer the potential to incentivise a culture change and drive the adoption of new techniques to improve the efficiency of scientific exploration. In this paper, we discuss the value of improved access and sharing of the two key types of results arising from work done in the computational sciences: models and algorithms. We propose the development of an integrated cloud-based system underpinning computational science, linking together software and data repositories, toolchains, workflows and outputs, providing a seamless automated infrastructure for the verification and validation of scientific models and in particular, performance benchmarks.

(see GitHub repo)


I’m a teapot

Great to see Google adhering to standards and finally implementing RFC 7168: The Hyper Text Coffee Pot Control Protocol for Tea Efflux Appliances (HTCPCP-TEA), published at the start of April this year.

From §2.3.3:

TEA-capable pots that are not provisioned to brew coffee may return either a status code of 503, indicating temporary unavailability of coffee, or a code of 418 as defined in the base HTCPCP specification to denote a more permanent indication that the pot is a teapot.




Illustrious company

Yesterday, I saw this quote from the blurb for Jamie Bartlett’s new book The Dark Net:

Beyond the familiar online world that most of us inhabit — a world of Google, Hotmail, Facebook and Amazon — lies a vast and often hidden network of sites, communities and cultures where freedom is pushed to its limits, and where people can be anyone, or do anything, they want. A world that is as creative and complex as it is dangerous and disturbing. A world that is much closer than you think.

The dark net is an underworld that stretches from popular social media sites to the most secretive corners of the encrypted web. It is a world that frequently appears in newspaper headlines, but one that is little understood, and rarely explored. The Dark Net is a revelatory examination of the internet today, and of its most innovative and dangerous subcultures: trolls and pornographers, drug dealers and hackers, political extremists and computer scientists, Bitcoin programmers and self-harmers, libertarians and vigilantes.

Based on extensive first-hand experience, exclusive interviews and shocking documentary evidence, The Dark Net offers a startling glimpse of human nature under the conditions of freedom and anonymity, and shines a light on an enigmatic and ever-changing world.

Computer science: an innovative and dangerous subculture indeed!

(N.B. I have not read this book)

1994 Web Flashback


Remember, this is pre-IE 1.0, viewed with (Mosaic) Netscape or Opera (well, or Lynx — see the web browser timeline).

…and it’s still alive and kicking!


Simon Jenkins on computer science

In a polemic in The Guardian today, Simon Jenkins argues for a(nother) shake up of the UK’s education system, with less focus on STEM and computer science in particular.

This kind of misinformed ranting on the utilitarian view of STEM and why the UK should focus on being a service industry appears to be his CiF modus operandi — see a similar post from February on mathematics education. In particular, he displays a profound misunderstanding of the difference between digital skills/competencies and the rigorous academic discipline of computer science, as well as a lack of awareness of the profound changes to computing education in England from September for all pupils from age five onwards. He also doesn’t appear to be aware of the increasing demands from pretty much every industrial sector for high-value digital skills (both user and creator skills); see the recently published interim report from the UK Digital Skills Taskforce: Digital Skills for Tomorrow’s World. As for the perceived high unemployment rates for computer science graduates? Well, this isn’t the full picture and is also discussed in detail in the Taskforce report.

While it is tempting to deconstruct and refute his article line by line, I will just link to an excellent response from Chris Mairs, Chief Scientist at Metaswitch Networks and Chair of the UK Forum for Computing Education.


Generating primes in LaTeX

Inspired by a recent discussion on the wonders of LaTeX, I started thinking about how easy it would be to generate prime numbers in LaTeX. Unsurprisingly, Knuth had already presented it as an example, using trial division, in The TeXbook (download) in 1984:


\newif\ifprime \newif\ifunknown % boolean variables
\newcount\n \newcount\p \newcount\d \newcount\a % integer variables
\def\primes#1{2,~3% assume that #1 is at least 3
\n=#1 \advance\n by-2 % n more to go
\p=5 % odd primes starting with p
\loop\ifnum\n>0 \printifprime\advance\p by2 \repeat}
\def\printp{, % we will invoke \printp if p is prime
\ifnum\n=1 and~\fi % ‘and’ precedes the last value
\number\p \advance\n by -1 }
\def\printifprime{\testprimality \ifprime\printp\fi}
\def\testprimality{{\d=3 \global\primetrue
\loop\trialdivision \ifunknown\advance\d by2 \repeat}}
\def\trialdivision{\a=\p \divide\a by\d
\ifnum\a>\d \unknowntrue\else\unknownfalse\fi
\multiply\a by\d
\ifnum\a=\p \global\primefalse\unknownfalse\fi}


% usage
The first 100 prime numbers are:~\primes{100}
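
If TeX count registers are not your thing, the same control flow can be sketched in Python. This is my own transliteration for illustration, not Knuth's code: the names `is_prime_odd` and `first_primes` are made up, and the comments point back at the macros they loosely mirror.

```python
from typing import List


def is_prime_odd(p: int) -> bool:
    """Trial-divide an odd p >= 5 by d = 3, 5, 7, ... (mirrors the testprimality macro)."""
    d = 3
    while True:
        a = p // d           # \a=\p \divide\a by\d
        unknown = a > d      # \ifnum\a>\d \unknowntrue\else\unknownfalse\fi
        if a * d == p:       # \multiply\a by\d, then \ifnum\a=\p: d divides p
            return False
        if not unknown:      # quotient no longer exceeds d: no divisor can remain
            return True
        d += 2               # \advance\d by2


def first_primes(n: int) -> List[int]:
    """Return the first n primes, assuming n >= 3 as the TeX macro does."""
    primes = [2, 3]          # the macro prints "2, 3" before looping
    p = 5                    # odd candidates start at 5
    while len(primes) < n:
        if is_prime_odd(p):
            primes.append(p)
        p += 2               # \advance\p by2
    return primes


print(first_primes(10))  # [2, 3, 5, 7, 11, 13, 17, 19, 23, 29]
```

The loop stops as soon as the quotient p // d is no larger than d, which is the integer-only way of saying "d has passed the square root of p".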


You can also do it by sieving; check out the examples in my GitHub repo.
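
For comparison, here is a minimal sieve of Eratosthenes in Python. It is a generic textbook sketch, not the code from the repo, and the name `primes_up_to` is made up for illustration. Unlike trial division it takes an upper bound rather than a count of primes.

```python
from typing import List


def primes_up_to(limit: int) -> List[int]:
    """Return all primes <= limit using the sieve of Eratosthenes."""
    if limit < 2:
        return []
    is_candidate = [True] * (limit + 1)
    is_candidate[0] = is_candidate[1] = False
    p = 2
    while p * p <= limit:
        if is_candidate[p]:
            # Cross out multiples of p, starting at p*p (smaller ones are already gone).
            for multiple in range(p * p, limit + 1, p):
                is_candidate[multiple] = False
        p += 1
    return [i for i, flag in enumerate(is_candidate) if flag]


print(primes_up_to(30))  # [2, 3, 5, 7, 11, 13, 17, 19, 23, 29]
```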


Today’s “University View” column in the Western Mail

This is the short article I wrote for the University View column in today’s Western Mail:

Technology is arguably the biggest lever on our lives, affecting everything from the way we communicate, do business, shop, travel, access information and are entertained. Our dependence on digital infrastructure is increasing all the time; from the demand for high-bandwidth Internet connectivity through to the devices we carry in our pockets. We truly live in a computational world, glued together by software.

But the real question is: do we direct technology, or do we let ourselves be directed by it and those who have mastered it? “Choose the former,” writes author Douglas Rushkoff, “and you gain access to the control panel of civilisation. Choose the latter, and it could be the last real choice you get to make”; in essence: program or be programmed.

So why do we have a seemingly antiquated perspective of technology education, primarily focusing on developing increasingly transient IT user skills, rather than equipping young people with a deeper understanding of how technology works, on how to solve problems with technology, on programming and computational thinking skills? Why are we not developing a generation of digital creators, empowered to make, break and manipulate their digital world, rather than a generation who are becoming consumers of technology?

This is a question I have been asking repeatedly over the past couple of years. Last year I co-chaired the Welsh Government’s review of the ICT curriculum, in light of significant reform across the rest of the UK. From September, there will be a new compulsory subject called Computing replacing ICT in England from age five onwards, focusing on computer science, programming and computational thinking: “A high-quality computing education equips pupils to use computational thinking and creativity to understand and change the world.” Precisely so.

We are currently in the midst of a significant review of education in Wales, asking fundamental questions about what education should achieve for young people. Alongside this we have long term policy evolving around skills, identifying the types of skills we require to create a healthy and prosperous society that is economically secure but also agile and adaptable to changing industries and sectors. While I recognise it is important that we take stock of where we are in Wales and identify the most appropriate solutions to some of our educational problems, it seems bizarre that we are delaying on what appears to be a no-brainer: making digital skills and computing education a core part of our curriculum. It is not a question of rushing into solutions, or copying other countries — this is about creating aspirations for our young people, developing future-proof skills and global competitiveness. I am baffled that we still have to justify why they should be core for all. We should turn the question around: can anyone justify why we shouldn’t make computing a core part of the curriculum?

Ultimately it comes down to what we want a future Wales to look like. Do we want to be a knowledge economy, leveraging our culture and being innovative and creative with technology? The Welsh Government have identified a number of priority sectors for economic renewal, alongside significant investment in our science and engineering research base, as well as recognising the broader societal and economic importance of e-infrastructure, connectivity and digital inclusion. All of these are predicated on having a country and citizenry with high-value digital and computational skills. It currently remains to be seen if we can deliver a digital Wales.

(N.B. text published in the print copy of the paper may differ slightly due to copy-editing)

