There is a major story in the UK breaking now about the code behind the Imperial College Model for Covid-19 which I also wrote up in Code, Models and Covid-19, where I put it in a philosophical context with references to mathematical work too (e.g. Agda, Coq, Idris, Scala).
Henry Story said:
There is a major story in the UK breaking now about the code behind the Imperial College Model for Covid-19 which I also wrote up in Code, Models and Covid-19, where I put it in a philosophical context with references to mathematical work too (e.g. Agda, Coq, Idris, Scala).
Thanks for doing the write-up. I was going to bring it up, but decided against it because I don't feel competent enough to review code. I feel like this should belong in #practice: communication instead of here, since it is partly about peer review in academia, but it's fine here too, I guess.
You wrote:
All of the above is explained in detail in the Code Review of Ferguson’s Model that I have been drawing on. Sadly that review ends with this personal suggestion that shows that the author does not understand the philosophical reasons for why testing and peer review are important, nor what the origin of the problem was.
On a personal level, I’d go further and suggest that all academic epidemiology be defunded. This sort of work is best done by the insurance sector. Insurers employ modellers and data scientists, but also employ managers whose job is to decide whether a model is accurate enough for real world usage and professional software engineers to ensure model software is properly tested, understandable and so on. Academic efforts don’t have these people, and the results speak for themselves.
I think the author has other reasons for writing what they did. Many lockdown skeptics are libertarians and have been pushing this story for a while now to score points.
They may have other reasons, or may have been shaped by a different ideology. The author worked at Google, it seems, which should have made them aware that Google is built on open source coming from universities as well as private research. (The SUN in Sun Microsystems stood for Stanford University Network, and it was built on Berkeley Software Distribution (BSD) Unix.)
I'm sure there are libertarians in Google as well, and the fact that the author had worked at Google in the past suggests that they may have left after becoming disaffected by the "groupthink" at Google.
Yes, but one can (as I attempt to do in the blog post) separate the factual (the state of the code) from the interpretation here. It is actually a great way to make the case for Open Source and Peer Review. Note that I have not had time to really look at the code myself. I am essentially disagreeing with that interpretation.
Henry Story said:
Yes, but one can (as I attempt to do in the blog post) separate the factual (the state of the code) from the interpretation here. It is actually a great way to make the case for Open Source and Peer Review. Note that I have not had time to really look at the code myself. I am essentially disagreeing with that interpretation.
And it's a good thing you did as well. There are a few things going on in the blog you cited, but I also think the case of the ICL code makes a good case for open source and peer review. You should read the threads over on #practice: communication: there were some really vigorous discussions there.
Also #general > Shaky foundations, which led to #practice: communication being started.
Very much a case of "shaky foundations".
Some of the stuff that was said in that topic was absolutely astounding to me. Seriously.
The real wtf is that code written by scientists is used as the basis of policy. I'd never expect an epidemiologist to be a competent programmer, and I'd never expect a programmer to be a competent epidemiologist. Both are full-time and very specialised fields of expertise.
I would never blame a scientist for writing bad code, because writing good code is something that scientists are not trained to do, and do not receive any benefits from it. However, I blame the UK government for using the results of the code to guide policy without having it rewritten by a team of professional software developers
This got me thinking about our plans to transform scientific computing using category theory. I think a lot of that is about allowing programmers to write good code directly by bringing the programming language up to the level of abstraction they already think at. I think the ability to base policy (or "mere" scientific fact) on code written by scientists is something we aspire to be possible in the future, but it's clearly very far from possible now.
Imperial College has a renowned Computer Science department. (I went to a course on Category Theory there by Abbas Edalat in 1993 or so, but I was too short on time to be able to follow it.) So they should at least have gone over there to look at what was happening.
The good news from this is that one can now explain to liberal/conservative "private enterprise" folks just how expensive non-open source code can be, and the value of peer review. It should be a requirement. Had they done so, they would certainly have gotten a lot of useful feedback.
Jules Hedges said:
This got me thinking about our plans to transform scientific computing using category theory. I think a lot of that is about allowing programmers to write good code directly by bringing the programming language up to the level of abstraction they already think at. I think the ability to base policy (or "mere" scientific fact) on code written by scientists is something we aspire to be possible in the future, but it's clearly very far from possible now.
Yes, that is the main point I make in my blog post: C or C++ were clearly far too low-level for what they were doing. There are many much better programming languages they could have worked with: Scala, Haskell, ... come to mind; even Java would have been better. And there is still ample room for a perfect modeling language of the future to be invented.
I've now seen 2 "code reviews" of the Imperial code. The first wasn't actually a code review at all but a rant. This one is a bit better, but it's on a site with "skeptics" in the name, which nowadays means anti-science conspiracy theorists
Certainly I don't trust either of them for a second
I'd like to see an actual code review written in a neutral tone
Well my blog post has a neutral tone. So you should like it. :-)
At least one """good""" thing comes out of this - I'm definitely going to refer to this whole thing in some future grant application
Many of these emerged from work to help bring mathematical certainty to programming or indeed dually to help automate and verify mathematical proofs.
In the informal sense this could be considered a duality, but I believe the Curry-Howard correspondence is covariant, not contravariant
Honestly it would probably have been better for the model to have remained in Fortran. I think a modern rewrite would have aimed for a Python or Julia notebook; those seem to be the rage these days
I was actually wondering if I should use "dually" when I wrote that.
I have heard about Julia. If the aim is to go for mass parallelism, to model the world better, and use more computing power, then the Akka actor framework in Scala could have worked too. Mind you, I believe the code would then have had to give up on determinism. The code is criticised for the indeterminacy in the single-threaded model, but in a multi-threaded model I think some indeterminism would be unavoidable, as message passing does not happen in a deterministic order.
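To illustrate what I mean, here is a minimal sketch (plain Scala Futures rather than Akka, and nothing to do with the IC code): the order in which parallel tasks complete is simply not fixed between runs.

```scala
import scala.concurrent.{Await, Future}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration._

object NonDeterminismDemo {
  def main(args: Array[String]): Unit = {
    // Five independent tasks scheduled on a thread pool;
    // the order in which they print is not fixed between runs.
    val tasks = (1 to 5).map { i =>
      Future(println(s"task $i finished on ${Thread.currentThread().getName}"))
    }
    Await.ready(Future.sequence(tasks), 5.seconds)
  }
}
```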
Jules Hedges said:
I would never blame a scientist for writing bad code, because writing good code is something that scientists are not trained to do, and do not receive any benefits from it. However, I blame the UK government for using the results of the code to guide policy without having it rewritten by a team of professional software developers
AFAIK the scientist(s) running a lab would usually hire software developers to write code for them. I know professional software developers working for university research teams who're writing high-performance code in low level languages like C++.
So the question becomes: why did Ferguson's lab, which must surely employ some professional software developers, produce bad code?
Scratch that: apparently Ferguson did write his own code.
It is entirely possible to write clean code in C++, by the way. Robert C Martin wrote the book on it. One interesting project for ACT would be to see if the concepts in his book can be translated into CT language. It is also eminently possible, of course, that this has already been done in the CS literature.
I’m conscious that lots of people would like to see and run the pandemic simulation code we are using to model control measures against COVID-19. To explain the background - I wrote the code (thousands of lines of undocumented C) 13+ years ago to model flu pandemics...
- neil_ferguson (@neil_ferguson)
Henry Story said:
If the aim is to go for mass parallelism, to model the world better, and use more computing power, then the Akka actor framework in Scala could have worked too.
I've heard complaints about Scala, and that's coming from FP people.
Scala is a hybrid OO-FP language with path-dependent types. So FP folks who don't realise that OO is coalgebraic may not realise that it too can have solid CT backing.
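To make the "OO is coalgebraic" remark concrete, here is a toy sketch (names invented for illustration): an object interface read as a coalgebra X -> Int × X, i.e. an observation together with a next state.

```scala
object CoalgebraDemo {
  // A counter interface read coalgebraically: from a state you can
  // observe a value (Int) and step to a next state (Counter), i.e. X -> Int × X.
  trait Counter {
    def value: Int    // observation
    def tick: Counter // transition to the next state
  }

  // One concrete coalgebra, carried by an immutable case class.
  final case class SimpleCounter(value: Int) extends Counter {
    def tick: Counter = SimpleCounter(value + 1)
  }

  def main(args: Array[String]): Unit = {
    val c = SimpleCounter(0).tick.tick
    println(c.value) // prints 2
  }
}
```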
The complaints don't really sound like the "my paradigm is better than yours" type. I don't really understand the complaints either, but it seems like there are a lot of "traps" for the unwary programmer.
Jules Hedges said:
This one is a bit better, but it's on a site with "skeptics" in the name, which nowadays means anti-science conspiracy theorists
It does reference multiple Github issues, so perhaps that could be used as a guide to the changelog?
I don't see any claim by Ferguson that the model is predictive (i.e. predicts with a certain probability and accuracy what will happen, given certain parameters), but rather that the model suggests scenarios that might happen. On the other hand, the criticism from the "lockdownsceptics" code review seems to be based on the model being predictive. Isn't this just some misunderstanding?
Code being free of bugs is always important, but I would guess it is even more relevant for predictive models.
Jens Hemelaer said:
On the other hand, the criticism from the "lockdownsceptics" code review seems to be based on the model being predictive. Isn't this just some misunderstanding?
It's not that. The criticism is that it's giving unpredictable output that's characteristic of bad code, regardless of whether the model is supposed to be predictive or not. There are multiple references to Github issues that detail problems with the code. See also the criticism in issue 165 on Github, particularly this comment from a modeller who claims to have had extensive experience in industry.
Jules Hedges said:
Certainly I don't trust either of them for a second
Unfortunately, both David Davis and Steve Baker, who claims to be a software engineer, have jumped on this "code review": https://twitter.com/SteveBakerHW/status/1258165810629087232
But hey, more power to your future grant application. :+1:
.@DavidDavisMP is right. As a software engineer, I am appalled. Read this now. https://twitter.com/DavidDavisMP/status/1258143326764761088
- Steve Baker MP (@SteveBakerHW)
Rongmin Lu said:
It's not that. The criticism is that it's giving unpredictable output that's characteristic of bad code, regardless of whether the model is supposed to be predictive or not. There are multiple references to Github issues that detail problems with the code. See also the criticism in issue 165 on Github, particularly this comment from a modeller who claims to have had extensive experience in industry.
Yes, it's bad code, but the fact that it isn't predictive anyway can explain why Ferguson still felt comfortable "using it" while advising the government (regardless of whether that was a good call or not).
Jules Hedges said:
This one is a bit better, but it's on a site with "skeptics" in the name, which nowadays means anti-science conspiracy theorists
Scientific advance requires skeptics too. Indeed Popper argued that in science all one can do is falsify theories.
I was reading about that just a few weeks ago, after bumping into the paper Dual Intuitionistic Logic and a Variety of Negations: The Logic of Scientific Research, which got me interested in co-Heyting algebras.
Also, just because some people put up very bad arguments for a point of view does not mean there aren't much better arguments for it. One of Nietzsche's aphorisms goes: "The most perfidious way of harming a cause consists of defending it deliberately with faulty arguments." Though in this case the widely known "Do not attribute to malice what can be explained by incompetence" is more relevant. In fact, in the case of this epidemic there are simply very many unknowns, so it is quite reasonable for different well-informed people to come to different conclusions.
we're far off topic but the "code reviews" of the imperial model are deeply embarrassing and remind me of the "climategate" stuff https://philbull.wordpress.com/2020/05/10/why-you-can-ignore-reviews-of-scientific-code-by-commercial-software-developers/
Having sound methodologies and reproducibility are things where scientific code can be much better. However, the attempts to extrapolate amateur-hour "code review" onto the imperial model are just about discrediting genuine research by holding it up to invented standards, by people not familiar with the norms (as much as they can be improved), practices, and methods of the field they're dealing with.
they also, like some of the discussions here on various topics, have an idealised picture of the "good" software practices actually conducted in industry.
an analogy would be that suppose you had a socially agreed on proof of a mathematical fact people didn't like -- let's say that the equations showed that a rocket was going to blow up, and so you shouldn't use the rocket, but a lot of people really liked the rocket, and thought it would do something that would make them money, like, say transport valuable minerals from an asteroid. So now the people who want to use this rocket, damn the risk, and get their asteroid minerals, well they say "this proof is nonsense. it's written on paper and in tex, but it is using theorems that are not in a proof assistant, and are not computer verified! it has steps that reduce equations, but the reductions are not shown to be sound in any topos with synthetic differentials, " etc. Well. I mean one might agree that in general it would be a wonderful goal to have an automated and synthetic theory of this stuff, and that we dream of it. But in the meantime, the proof still seems to indicate maybe one shouldn't use the explodey rocket?
There is something similar going on here.
It would be great if all this stuff were better documented and engineered to help knowledge, to help things stand up to criticism (including from know-nothings), etc. It is an important goal (and there's important work being done on reproducible scientific computing -- one example from someone i know -- https://kar.kent.ac.uk/57488/) but it is not being raised in this case out of pure ends, but as a way to justify philistinism.
Jens Hemelaer said:
Yes, it's bad code, but the fact that it isn't predictive anyway can explain why Ferguson still felt comfortable "using it" while advising the government (regardless of whether that was a good call or not).
All models offer some form of prediction, whether they be confidence intervals or scenarios. I think you have stated what Gershom has just called an "invented standard" of what "predictive" means. The idea that models cannot be said to be "predictive", if they don't predict "with a certain probability and accuracy what will happen, given certain parameters", would exclude too many models that inform policy-making from being considered predictive.
Gershom said:
we're far off topic but the "code reviews" of the imperial model are deeply embarrassing and remind me of the "climategate" stuff https://philbull.wordpress.com/2020/05/10/why-you-can-ignore-reviews-of-scientific-code-by-commercial-software-developers/
Phil Bull's critique is equally embarrassing. Whatever happened to staying in his lane, since he's so keen on lecturing software developers for not staying in their own? And if an astrophysicist can comment on epidemiology code, can I also consider the opinion of Ben Lewis, another astrophysicist, or is his opinion also an attempt to "justify philistinism"?
Here's another critical blog, this time by Chris von Csefalvay, who claims to be "a clinical computational epidemiologist" on his website. I'll dissect Bull's blog later, but for now, here are excerpts from von Csefalvay, who thankfully doesn't attempt any political "code review".
Henry Story said:
One of Nietzsche's aphorisms goes: "The most perfidious way of harming a cause consists of defending it deliberately with faulty arguments."
Oh boy, you'd love this blog by Phil Bull, an astrophysicist at QMUL, that Gershom posted.
Gershom said:
Having sound methodologies and reproducibility are things where scientific code can be much better. However, the attempts to extrapolate amateur-hour "code review" onto the imperial model are just about discrediting genuine research by holding it up to invented standards, by people not familiar with the norms (as much as they can be improved), practices, and methods of the field they're dealing with.
I don't know what you mean by "invented standards": standards are (mostly?) invented by people, so "invented" is superfluous.
I presume you meant "arbitrarily high standards", but these are not arbitrary: these are the standards that one would derive from basic principles of software engineering, which were themselves derived from the programming community's prior experience of how to develop "good" code. That they're not uniformly adhered to in the commercial setting does not imply that they are arbitrary; rather, it's symptomatic of the lack of enforcement of such standards.
And going by the above paragraph, you're suggesting the "norms" or the status quo in scientific coding includes the use of unsound methodologies which hinder reproducibility. I agree, and that is the sum total of Henry's and my concern, really.
That there are others who would use this to further other undesirable agendas is unfortunate, but if the methodologies in scientific coding still have room for improvement when it comes to their soundness, that misfortune will still persist as long as people insist on open source as a virtue and freedom of information requests (link is to a FOIA request for the ICL code to be made public) are still a thing.
Reading with interest the various linked to blog posts above.
Gershom said:
"this proof is nonsense. it's written on paper and in tex, but it is using theorems that are not in a proof assistant, and are not computer verified! it has steps that reduce equations, but the reductions are not shown to be sound in any topos with synthetic differentials, "
This is a strawman. Phil Bull has actually highlighted the non-deterministic output as a concern that he considers to be the most legitimate from "Sue Denim":
This is the most important one, as it could, in particular circumstances, be a valid criticism.
However, Bull claims that the problem isn't fatal, because the developer is apparently aware of the bug:
The bug is not unknown. A particular workaround here appears to be re-running the model many times with different seeds, which is what you’d do with this code anyway; or using different settings that don’t seem to suffer from this bug. My guess is that the “false stochasticity” caused by this bug is simply inconsequential, or that it doesn’t occur with the way they normally run the code. They aren’t worried about it — not because this is a disaster they are trying to cover up, but because this is a routine bug that doesn’t really affect anything important.
This is only mildly reassuring, though, since Bull qualified his argument as a "guess".
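To make the workaround itself concrete, here is a hedged sketch (the model function below is a made-up stub, not anything from the ICL code): re-run the stochastic model under several seeds and aggregate, so that no single run's noise dominates.

```scala
import scala.util.Random

object SeedAveraging {
  // Hypothetical stand-in for a stochastic model run: returns one noisy output.
  def runModel(seed: Long): Double = {
    val rng = new Random(seed)
    Iterator.fill(1000)(rng.nextGaussian()).sum / 1000.0
  }

  def main(args: Array[String]): Unit = {
    // Re-run with many seeds and average, rather than trusting any single run.
    val results = (1L to 20L).map(runModel)
    val mean = results.sum / results.size
    println(f"mean over ${results.size} seeds: $mean%.4f")
  }
}
```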
One of the problems with the code that I discussed at length is that the way it is written made it completely unscalable: i.e. able to use only one thread, when, say, a modern Oracle M8-8 computer with 8 CPUs can run a little over 2000 threads in parallel (and this could be increased indefinitely with enough money or cloud computing resources). So the way it was written was harmful to the efficiency of the research project itself. Furthermore, it would have been difficult for them to change the code to improve their model.
Jens Hemelaer said:
I don't see any claim by Ferguson that the model is predictive (i.e. predicts with a certain probability and accuracy what will happen, given certain parameters), but rather that the model suggests scenarios that might happen.
I think this assertion is mistaken:
The major challenge of suppression is that this type of intensive intervention package – or something equivalently effective at reducing transmission – will need to be maintained until a vaccine becomes available (potentially 18 months or more) – given that we predict that transmission will quickly rebound if interventions are relaxed. (p. 2)
In total, in an unmitigated epidemic, we would predict approximately 510,000 deaths in GB and 2.2 million in the US [...] (p. 7)
Social distancing of high-risk groups is predicted to be particularly effective at reducing severe outcomes given the strong evidence of an increased risk with age, though we predict it would have less effect in reducing population transmission. (p. 15)
You may insist on your arbitrarily high standard of what a "predictive" model is, but to a reader of this report and to the general public, Ferguson has been claiming that his model makes predictions. Again, I'd consider projected scenarios to be predictions as well, and I think the general public, particularly the politicians, would agree.
Rongmin Lu said:
Jules Hedges said:
I would never blame a scientist for writing bad code, because writing good code is something that scientists are not trained to do, and do not receive any benefits from it. However, I blame the UK government for using the results of the code to guide policy without having it rewritten by a team of professional software developers
AFAIK the scientist(s) running a lab would usually hire software developers to write code for them. I know professional software developers working for university research teams who're writing high-performance code in low level languages like C++.
What a strange idea. A university research group with enough funding may take on someone with programming experience to implement a theoretical model or to develop an existing model, if such a venture was deemed sufficiently valuable. But before this pandemic, the value of epidemiological models would largely have been considered theoretical; if someone told me that there simply weren't the financial or academic incentives to develop this model further, it wouldn't shock me.
As for enforcing coding standards... I only completed my Masters two years ago, and during my entire time on a maths course at university (things are specialised from the start in most UK degree courses) the programming training barely extended beyond the basics I had picked up out of interest before going to university. Training as to programming standards was non-existent, and even if it had been provided, unless compulsory I can tell you with confidence that the attendance would have been virtually zero. There's no way things could have been better 13 or more years ago. So while I agree with the ideal, I don't agree with the expectations: epidemiologists hadn't had time to develop new or sophisticated computational models (let alone theoretical ones) between the start of the outbreak and the time of the report. They used the models they had to hand, bugs and all. Yes, if the code had been open source, others could have examined the implementation sooner, and some of the bugs could have been fixed shortly thereafter, agreed. But academics aren't trained or employed to write beautiful or perfect code, so expecting that in a literal emergency is unrealistic and unfair.
Yes. In summary: The Real WTF is basing policy on code written by scientists
Rongmin Lu said:
I think this assertion is mistaken:
You're right about this, I shouldn't have made this claim.
@Jules Hedges I guess? I was going to point to you saying that before, but policymakers didn't have any more time to prepare either. It could have taken several extra days (at a generous minimum) to even get a proposal through government to commission a professionally constructed implementation of a model, let alone the time required to produce such a model. Given how inexcusably underprepared they were by the time of the report (a different story, and the real wtf imo), it was similarly necessary for them to rely on these models provided by academic epidemiologists. They have had time since then to retroactively do that work, though; has that happened? :thinking:
Those models have been used over the past 20 years to inform policy. I point in my blog to an article "6 questions" that gives previous cases where the model overestimated dangers.
And if you think this is bad, wait till you hear about computational economics
Policy must be based on science. But science is based on peer review. And peer review requires open source. The tools used by scientists must be criticisable and alterable.
Jules Hedges said:
Yes. In summary: The Real WTF is basing policy on code written by scientists
I partially disagree with this conclusion: having spent a reasonable amount of time in a team whose goal is to provide formally verified code for real-world applications (and succeeding in doing so, e.g. some of the crypto primitives used in Firefox come directly from this work), I would say that it depends a lot on the expected finality of the scientific work.
Well yes, I mention a number of quality software projects that were written at universities or polytechs. (Someone must be keeping a good list on these somewhere)
BSD Unix (Berkeley), the X Window System (MIT), the Mach kernel (Carnegie Mellon), Haskell's GHC (Glasgow), Scala (EPFL), the Coq proof assistant (INRIA), Agda (Chalmers), ...
Scala runs Twitter, so that is pretty scalable.
What should be done is that more universities produce such quality code and be rewarded for it. Researchers should not just be rewarded for paper citations, but also code citations (re-use of libraries), and code use.
It would be interesting if mathematicians could also be rewarded when programs use their discoveries.
The C++ code written at Imperial is quite complex. Given that complexity, the few extra steps to automate the tasks of building, testing, etc., which are not that complicated (people in industry can understand them, so that's just to say how little you need to know!), would actually have made their work a lot easier.
Indeed, the problem is that their code is likely needlessly complicated, because it has to deal with memory management, pointer arithmetic, ... You can actually watch talks by @Bartosz Milewski where he rants about how much more complicated C++ makes the life of a developer. And his background is C++.
I didn't look at the code, but I'd make a guess that the biggest underlying problem is that the model and the solver are tightly coupled
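If that guess is right, here is a toy sketch of what decoupling could look like (an invented SIR-style step for illustration, not a reading of the actual code): the model is a pure step function, and the solver iterates whatever step it is given, knowing nothing about epidemiology.

```scala
object DecoupledModel {
  // The "model": a pure description of one step of a toy SIR-like system.
  final case class State(s: Double, i: Double, r: Double)

  def sirStep(beta: Double, gamma: Double)(st: State): State = {
    val newInfections = beta * st.s * st.i
    val newRecoveries = gamma * st.i
    State(st.s - newInfections, st.i + newInfections - newRecoveries, st.r + newRecoveries)
  }

  // The "solver": iterates any step function; it knows nothing about the model.
  def solve[A](step: A => A, init: A, steps: Int): A =
    Iterator.iterate(init)(step).drop(steps).next()

  def main(args: Array[String]): Unit = {
    val step: State => State = sirStep(beta = 0.3, gamma = 0.1)
    println(solve(step, State(s = 0.99, i = 0.01, r = 0.0), steps = 100))
  }
}
```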
Now that's the type of argument that could make for a positive constructive criticism, that goes beyond those initial ones. :-)
Btw, here is a fun story. Around 1997 I was at AltaVista and tried to port the BabelFish machine translation service to use the Java Web Server. At some point I got a bit concerned: since their code was not open source, I could not tell how much memory the server would allocate to an HTTP header. So I wrote a shell script that created either one infinitely long header or an infinite number of headers. And indeed the server crashed with a memory exception. I sent that script in. Not long after, Sun Microsystems open-sourced their server.
I met a developer at the JWS stand at JavaOne and told them about that bug. They knew very well about that incident. The first thing they wanted to know was whether I had used a byte code decompiler...
So Sun management must have worked out that the code being closed was what led the developers to think that they did not need to be careful, and that the better policy for them was to open it.
Morgan Rogers said:
What a strange idea. A university research group with enough funding may take on someone with programming experience to implement a theoretical model or to develop an existing model, if such a venture was deemed sufficiently valuable. But before this pandemic, the value of epidemiological models would largely have been considered theoretical; if someone told me that there simply weren't the financial or academic incentives to develop this model further, it wouldn't shock me.
Yes, what a strange idea it is. And how strange it was that Phil Bull, instead of mounting an offensive against the revolting idea that academic modelling of epidemics be defunded, by arguing that more funding should be allocated so that academic modelling can be better resourced, should choose to go on the defensive and argue for the status quo, which is a direct result of this paucity of funding that you have so astutely observed.
Morgan Rogers said:
Jules Hedges I guess? I was going to point to you saying that before, but policymakers didn't have any more time to prepare either. It could have taken several extra days (at a generous minimum) to even get a proposal through government to commission a professionally constructed implementation of a model, let alone the time required to produce such a model. Given how inexcusably underprepared they were by the time of the report (a different story, and the real wtf imo), it was similarly necessary for them to rely on these models provided by academic epidemiologists. They have had time since then to retroactively do that work, though; has that happened? :thinking:
That's happening now. In March 2020, after Julian Todd made a FOIA request for the ICL code, Neil Ferguson tweeted a bit of an apologia, and got people from Microsoft/Github to look at the code. What you're seeing now is the fallout from the public release of the code on Github.
I’m conscious that lots of people would like to see and run the pandemic simulation code we are using to model control measures against COVID-19. To explain the background - I wrote the code (thousands of lines of undocumented C) 13+ years ago to model flu pandemics...
- neil_ferguson (@neil_ferguson)
I added links to the three blog posts mentioned above (one defending the code, the other two criticising it), with a little overview of their content, to my blog post. One point I made that has not come up here yet is the following:
open sourcing the code early would have immediately led to improvements in it, as those improvements are actually quite obvious. When Linus Torvalds started writing the Linux kernel in 1991, it was on his own admission a sketch of an OS. Putting it online allowed it to grow through community feedback, in the form of patches that came in from all over the world. That improvement led to it being adopted by the Stanford students who went on to start Google, where (much improved) it runs their whole infrastructure to this day. It now runs all Android phones and a huge percentage of servers, and has been ported to every conceivable chip.
Henry Story said:
One of the problems with the code that I discussed at length is that the way it is written made it completely unscalable: i.e. able to use only one thread, when, say, a modern Oracle M8-8 computer with 8 CPUs can run a little over 2000 threads in parallel (and this could be increased indefinitely with enough money or cloud computing resources). So the way it was written was harmful to the efficiency of the research project itself. Furthermore, it would have been difficult for them to change the code to improve their model.
Henry what you wrote is wrong, even just reading that ticket. The code as written _can_ scale. In fact, the nondeterminism was a bug introduced and then fixed over a span of _days_ that _only_ affected the code when it was running in parallel mode! I.e. the (temporary) nondeterminism was a case of a subtle issue introduced precisely by parallelism. Please don't write with confidence when you don't know what you're saying.
I was going on the paragraph of the code review that starts with this:
Imperial advised Edinburgh that the problem goes away if you run the model in single-threaded mode, like they do.
But if that is put into question, I'll look into it in more detail.
Right, that problem went away. But it was also fixed in the multithreaded code! So it seems irrelevant. The people trying to make it relevant are doing so towards malicious ends, using sophistry to delegitimize the consensus, agreed on by many models and modelling groups and by now confirmed by death tolls etc., that absent a lockdown covid would spread widely and kill many people.
Ok, I have removed the mistaken sentences relating to the code not being parallelisable. Instead I developed a bit of the history of how the programming community moved from stateful Object Oriented programming to functional programming with immutable data structures, due to the huge increase in the number of available threads on modern CPUs. I cite @Bartosz Milewski's book "Category Theory for Programmers" and the article "A Computational Science Agenda for Programming Language Research", which I was pointed to. This helps explain the bug in the IC multithreaded code: it is just really difficult to do multithreaded programming in a stateful OO language.
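To illustrate the contrast in miniature (toy code, not a claim about the IC model): threads mutating one shared variable can race against each other, whereas threads that each return values over immutable data can be combined deterministically afterwards.

```scala
import scala.concurrent.{Await, Future}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration._

object MutableVsImmutable {
  def main(args: Array[String]): Unit = {
    // Stateful style: many threads update one shared variable -> possible data race.
    var shared = 0
    val racy = (1 to 1000).map(_ => Future { shared += 1 })
    Await.ready(Future.sequence(racy), 10.seconds)
    println(s"shared counter (racy, may be less than 1000): $shared")

    // Functional style: each task returns a value over immutable data,
    // and the results are combined deterministically afterwards.
    val safe = (1 to 1000).map(i => Future(i))
    val total = Await.result(Future.sequence(safe), 10.seconds).sum
    println(s"combined result (always 500500): $total")
  }
}
```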
Jens Hemelaer said:
I don't see any claim by Ferguson that the model is predictive (i.e. predicts with a certain probability and accuracy what will happen, given certain parameters), but rather that the model suggests scenarios that might happen. On the other hand, the criticism from the "lockdownsceptics" code review seems to be based on the model being predictive. Isn't this just some misunderstanding?
I am just looking at the Report 13 - Estimating the number of infections and the impact of non-pharmaceutical interventions on COVID-19 in 11 European countries by Imperial and others. Table 2 is entitled "Total forecasted deaths since the beginning of the epidemic up to 31 March in our model and in a counterfactual model (assuming no intervention had taken place)"
So these are counterfactual models (discussed elsewhere here), as they have to be, given that a model can influence policy, which would then change reality. Counterfactuals are also, I guess, known as scenarios.
Actually on closer inspection it looks like they have a model for what happened at the time of publication (30 March) and the counterfactual model for what could have happened had nothing been done.
I wanted to look for the predictions they make on Sweden, as that is the country that decided to go without lockdown. But it is time to go to sleep here in Germany.
Henry Story said:
I wanted to look for the predictions they make on Sweden, as that is the country that decided to go without lockdown. But it is time to go to sleep here in Germany.
From my readings, I had the impression they did not make predictions for Sweden.
Jules Hedges said:
Yes. In summary: The Real WTF is basing policy on code written by ~~scientists~~ people who're not trained in good software engineering practice.
FTFY
Kenji Maillard said:
[...] I would say that it depends a lot on the expected finality of the scientific work.
The key word is "expected", which is the problem. Presumably, Ferguson did not expect to gain so much attention before, and so the code languished unexamined for 13+ years. Unfortunately, things tend to "scale" unexpectedly these days, and so it is more imperative than ever to adhere to good software engineering practices when writing any code.
Sadly, the "not invented here" mentality, cultural inertia and a lack of resources would probably mean the status quo remains the same for scientific computing in academia for years to come.
Rongmin Lu said:
Henry Story said:
I wanted to look for the predictions they make on Sweden, as that is the country that decided to go without lockdown. But it is time to go to sleep here in Germany.
From my readings, I had the impression they did not make predictions for Sweden.
They do in Report 13 - Estimating the number of infections and the impact of non-pharmaceutical interventions on COVID-19 in 11 European countries, but only for one week ahead. A meteorologist contact of mine told me:
They estimated the reproduction no. R_0 in late March to be 2.64, with a 2.5% lower bound of 1.5, after all the current social distancing had been in place for some time. At that level the epidemic would have raced away. But it did not.
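Just as a back-of-the-envelope illustration of why an R around 2.64 "races away" (the 5-day serial interval below is my own assumption, purely for the arithmetic):

```scala
object GrowthSketch {
  def main(args: Array[String]): Unit = {
    val r = 2.64                 // estimated reproduction number
    val serialIntervalDays = 5.0 // assumed generation time (my assumption)
    val generationsPerMonth = 30.0 / serialIntervalDays
    val monthlyFactor = math.pow(r, generationsPerMonth)
    println(f"at R = $r, cases would multiply roughly ${monthlyFactor}%.0f-fold per month")
  }
}
```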
I can see some good-faith pushback to Bull's blog in the comments there. Unfortunately, it does seem like he was more interested in hitting back and scoring points than in being constructive. Yes, Sue Denim is being provocative, but saying in your headline that you can ignore "commercial" "software developers" (and the headline is presumably all that most people read these days) risks alienating way too many people who could have been your allies, and plays right into the provocateur's hand.
@samb8s @mike_peel I agree -- a bit more training in software development techniques would go a long way! But laziness != incorrectness, and this is the key message to get across in the face of these politically motivated attacks.
- Phil Bull (@philipbull)
What follows is going to be a detailed analysis of Bull's blog, as I promised earlier. It makes me really uncomfortable that someone who has worked on documentation, and is interested in outreach, actually said what was said in that blog.
He claims that scientific code needs to satisfy three criteria: scientific correctness, flexibility, and performance.
Even at first glance, aren't all of these desirable qualities in any code?!
By his definition of "scientific correctness", that's just "correctness" in ordinary parlance. I don't know of any (functional) programmer who doesn't want a mathematically and logically correct representation of the (business) model and a correct handling and interpretation of input.
With flexibility: that's exactly what code that is well-documented and well-maintained allows you to do, along with good software engineering practices.
I'm not going to touch performance, but again, I'm sure there's decades worth of expertise out there that I'm not aware of.
On to the "last four points [that] will horrify most software developers"... ooh, I can't wait.
Maintainability: Most scientific codes aren’t developed with future maintainers in mind. As per John Cook, they are more likely to be developed as “exoskeletons” by and for a particular scientist, growing organically over time as new questions come up.
Except that the set of future maintainers includes future you, dear scientist. And future you 13+ years from now is not going to remember a certain decision you've made now that's causing that funky bug you would've spent hours in that future trying to hunt down.
(Keeping track of tenses in time-travel scenarios is hard!)
Scientific code gets worked on for a long time, which automatically makes maintainability an issue. This was certainly the case for the ICL code.
Documentation: Providing code and end-user documentation is a good practice, but it’s not essential for scientific codes.
Not invented here. Moving on.
User-proofing/error checking: [...] In essence, the user is assumed to understand all of the (known/intended) limitations of the code and its outputs. This will generally be the case if you run the code yourself and are an expert in your particular field.
Again, dear scientist, is future you going to remember all the limitations of the code and its output? What about unintended limitations? (See also: formal testing)
This also doesn't scale. Wouldn't this obstruct scientific collaboration? Open science was something that came to the fore in this pandemic, with online biomedical research archives disseminating research at the speed of light. Isn't scientific collaboration something we should promote further, even after the pandemic is over?
Formal testing: Software developers know the value of a test suite: Write unit tests for everything; throw lots of invalid inputs at the code to check it doesn’t fall over; use continuous integration or similar to routinely test for regressions. This is good practise that can often catch bugs. Setting up such infrastructure is still not the norm in scientific code development however. So how do scientists deal with regressions and so on? The answer is that most use ad hoc methods to check for issues.
Yup, nothing to see here. :face_palm:
Here, he's conflating two things: testing the results produced by the code (which he treats as a black box) against the model for "scientific" correctness, and testing the code itself. Of course the scientist-user would do the former; the developer would be responsible for the latter. Usually, best practice would suggest a separation of concerns here, but a lack of resources to implement this separation is probably a common issue.
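For what it's worth, the kind of test being dismissed as non-essential can be tiny. Here is a sketch using plain assertions (the model step is a made-up stub), so no test framework is even needed:

```scala
object ModelTests {
  // Hypothetical model step under test: one update of a toy SIR-like system.
  def step(s: Double, i: Double, beta: Double, gamma: Double): (Double, Double) = {
    val newInfections = beta * s * i
    (s - newInfections, i + newInfections - gamma * i)
  }

  def main(args: Array[String]): Unit = {
    val (s1, i1) = step(0.99, 0.01, beta = 0.3, gamma = 0.1)
    assert(s1 >= 0.0 && i1 >= 0.0, "fractions must stay non-negative")
    assert(s1 + i1 <= 1.0 + 1e-12, "susceptible + infected cannot exceed the population")
    // Regression check: the same inputs must give the same outputs on every run.
    val (s2, i2) = step(0.99, 0.01, beta = 0.3, gamma = 0.1)
    assert(s2 == s1 && i2 == i1, "the model step should be deterministic")
    println("all checks passed")
  }
}
```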
I could go on, but hopefully this is enough to establish my case — the author of that article is out of their depth, and clearly unaware of many of the basics [...]. In fact, they are so far out that they don’t even realise how silly this all sounds to someone with even a cursory knowledge of this kind of thing — it is an almost perfect study in the Dunning-Kruger effect.
Ok.
How they reached the conclusion that scientists must be so incompetent that “all academic epidemiology [should] be defunded” and that “This sort of work is best done by the insurance sector” is truly remarkable
For this to actually be the case, though, the Imperial group must have (a) evaded detection for over 10 years from a global community of competing experts; (b) be almost criminally negligent as scientists, by having ignored easily-discovered but consequential bugs; (c) be almost criminally arrogant to suppose that their unchecked/flawed model should be used to inform such big decisions; and (d) for the entire scientific advisory establishment to have been taken for a ride without any thought to question what they were being told.
(a) That's easy to do, given that their code wasn't open-sourced until now. As long as the results are "scientifically" and directionally correct, how is anyone in the world who isn't part of that group going to know of any flaws in the code?
(b) This is debatable.
(c) Their model has been used to inform big decisions. Whether it's flawed or not is debatable, but the code was unchecked.
(d) See the replication crises.
Rongmin Lu said:
How they reached the conclusion that scientists must be so incompetent that “all academic epidemiology [should] be defunded” and that “This sort of work is best done by the insurance sector” is truly remarkable
- They have a political agenda.
- The insurance industry spends a lot of money getting the modelling right, because it will cost them money if they get it wrong. Hence, the people working in that sector have enough resources to engage in the best practices that are being rejected here.
They definitely have an agenda. It is true that they can lose a lot of money if their models are not right, so they have an incentive structure that is built up correctly, where it seems it is not in academia.
But really there should be nothing stopping universities and polytechs working together with industry on open source projects to build better modeling platforms (see Scala, Linux, ...). The ideology that only private enterprise can do these things right is the dangerous one. So there are rotten arguments on both sides.
Henry Story said:
Rongmin Lu said:
- The insurance industry spends a lot of money getting the modelling right, because it will cost them money if they get it wrong. Hence, the people working in that sector have enough resources to engage in the best practices that are being rejected here.
But really there should be nothing stopping universities and polytechs working together with industry on open source projects to build better modeling platforms (see Scala, Linux, ...).
There is. They don't have the resources. And Bull's rant is not helping to build bridges.
You may as well ask a Java UI programmer to review security bugs in the Linux kernel.
So why is someone who has only contributed to Ubuntu commenting on software used in data analysis?
This is especially so since he seems unfamiliar with the fact that, in the commercial world, organisations that are well-resourced (i.e. those pesky insurance companies) and rely on data analysis do pay attention to all the best practices that he has downplayed.
For one thing, separation of concerns is a thing. Well-resourced workplaces do not have the scientist-user (called a "data scientist") building and maintaining software tools for themselves. They employ "data engineers", professional software developers skilled in coding for data analysis, to do that. They know that data scientists can't "code their way out of an envelope", to use an oft-used exaggeration. Not that data scientists "can't code", but that they don't know the fundamentals of good software engineering, and that can sink a business.
Professional software developers will hate some of these norms because they are bad end-user software engineering
Dear scientist, everyone is an end-user. These norms are just bad software engineering, full stop.
They do have resources, but instead of working cooperatively they decide to work in secret, which is much more costly.
They should get extra resources if they work in the open.
You are right though: Sue Denim and Bull could be playing on the same team. Bull building up terrible arguments in order to prove Sue's point.
The thing is that these norms of software engineering are really not rocket science. Opening the code would have immediately gotten friendly help here and there to improve it.
Henry Story said:
You are right though: Sue Denim and Bull could be playing on the same team. Bull building up terrible arguments in order to prove Sue's point.
Unwittingly, I might add. To recall something you wrote earlier:
One of Nietzsche's aphorisms goes: "The most perfidious way of harming a cause consists of defending it deliberately with faulty arguments."
I would omit the "deliberately" bit. Many people do that because they got annoyed, like Bull.
Yes, Bull likes to go for red flags.
I don't suppose there would be any possibility of moving this discussion to 'off topic'?
Nathaniel Virgo said:
I don't suppose there would be any possibility of moving this discussion to 'off topic'?
It's very on-topic, because this can be an issue for applied CT in the future. Also, this is an area in which applied CT may have the potential to make positive contributions, see some comments by Jules Hedges earlier.
Yes, I look forward to your blog post @Rongmin Lu
I don't have a blog, Henry.
It usually takes just a few minutes to set one up. I am not sure where a good blog hosting place is for mathematicians though. Medium is not good, as it does not allow math notation. This is worth a thread of its own here, I think.
I think I want to spare myself the headache of maintaining one for now.
Anyway, one final comment. If people can't get their hands on (or their heads around) Bob Martin's Clean Code, at least have a read of this transcript of an interview he did on The Changelog last October. I'll throw in some relevant excerpts:
And it will likely take some kind of event… And that event will be some kind of horrible tragedy. Some poor software guy will do something stupid and kill 10,000 people. And when that occurs, the politicians of the world will not be able to ignore it, so they’ll have to stand up and wag their fingers and point them at all the programmers and say “How could you have let this happen?”
But again, it’s our fingers on the keyboard. We are writing that code. How do we answer that question? When the politicians of the world finally stare us in the eye and say “Hey, guys, how could you have let this happen?”, how do we answer that question? Do we say “Well, you know, my boss told me it had to be done on Tuesday.”
The regulation is going to happen. There’s no way to avoid that. In the end, it’s got to happen. The question is whether we get to regulate ourselves or whether we are regulated by someone else. If the answer to the question – they come and they point their finger at us and say “How could you have let this happen?”, and if the answer to that question is “Look, this was an accident. It was terrible, it was horrible, but it was not because we were being negligent… And here’s why. Here are the practices that we follow. Here are the disciplines that we follow. Here are the standards we uphold.” If we can come back with that statement, then we will probably escape the worst of the regulations. They will still regulate us, but maybe they’ll use our own regulations.
Rongmin Lu said:
Nathaniel Virgo said:
I don't suppose there would be any possibility of moving this discussion to 'off topic'?
It's very on-topic, because this can be an issue for applied CT in the future. Also, this is an area in which applied CT may have the potential to make positive contributions, see some comments by Jules Hedges earlier.
I think it should be moved, it's tangentially on topic but you need quite a lot of imagination to see how. Maybe to #practice: software rather than #general: off-topic?
Jules Hedges said:
Rongmin Lu said:
Nathaniel Virgo said:
I don't suppose there would be any possibility of moving this discussion to 'off topic'?
It's very on-topic, because this can be an issue for applied CT in the future. Also, this is an area in which applied CT may have the potential to make positive contributions, see some comments by Jules Hedges earlier.
I think it should be moved, it's tangentially on topic but you need quite a lot of imagination to see how. Maybe to #practice: software rather than #general: off-topic?
#practice: software is for discussing software tools for doing CT, so it doesn't seem to be the right stream. I've initially directed Henry to #practice: communication when he posted here, because that stream had topics about peer review and is about scientific communication after all, but most of the discussion ended up here. Maybe #practice: communication?
Jules Hedges said:
it's tangentially on topic but you need quite a lot of imagination to see how.
I think I'll exercise some of that imagination and fill in some of the blanks to try to paint a better picture.
From what I understand of the aims of ACT, one of them is to understand complex systems and processes. To that end, there are several topics here about Petri nets, which you can use to formalise software (engineering) workflows.
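For instance, a bare-bones token-game encoding of a Petri net fits in a few lines (a toy encoding I'm inventing here, not any of the existing ACT libraries):

```scala
object PetriSketch {
  type Place   = String
  type Marking = Map[Place, Int]

  // A transition consumes tokens from some places and produces tokens in others.
  final case class Transition(name: String, in: Marking, out: Marking)

  def enabled(m: Marking, t: Transition): Boolean =
    t.in.forall { case (p, n) => m.getOrElse(p, 0) >= n }

  // Fire a transition if it is enabled, returning the new marking.
  def fire(m: Marking, t: Transition): Option[Marking] =
    if (!enabled(m, t)) None
    else {
      val consumed = t.in.foldLeft(m) { case (acc, (p, n)) => acc.updated(p, acc.getOrElse(p, 0) - n) }
      val produced = t.out.foldLeft(consumed) { case (acc, (p, n)) => acc.updated(p, acc.getOrElse(p, 0) + n) }
      Some(produced)
    }

  def main(args: Array[String]): Unit = {
    // Toy code-review workflow: a patch moves from "written" to "reviewed".
    val review = Transition("review", in = Map("written" -> 1), out = Map("reviewed" -> 1))
    println(fire(Map("written" -> 2), review)) // Some(Map(written -> 1, reviewed -> 1))
  }
}
```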
Despite the opinions of some in this Zulip chat, good software engineering practices are not some esoteric mumbo-jumbo dreamed up by CS professors to torture undergraduate software engineers and guilt-trip people who code for a living.
They are a distillation of experience won through blood, sweat and tears by software engineers toiling in industry, knowledge that has been rediscovered many times. They are supposed to optimise the process of building complex software systems: they make the later stages of that process easier, at the price of some teething pains in the earlier stages.
Which means they should naturally fall within the ambit that ACT has set for itself. I have already pointed to the work of Bob Martin, who was one of the authors of the Agile manifesto and the author of Clean Code and other books about writing "clean" software, i.e. about good software engineering practices.
As I see it, one challenge that people in ACT may set for themselves is to formalise and study these software engineering processes in CT. The things that I've covered in this topic are also useful general knowledge that anyone who's working with code should have, and I presume that many people in ACT are, or aspire to be, working with code.
Hi sorry, I probably shouldn't have spoken up earlier, I just have a big mouth sometimes. It felt a bit off topic when it got really into the details of one particular case, but I agree this general topic is very relevant. Sorry if I disrupted things.
No biggie. Sometimes things get side-tracked for a while and it may be good to have a jolt. In this case, we were having a lighter moment at the time. But I got my message out in the end, it seems, so it's all good.