The sample size of this study is statistically too insignificant to draw the con...

efarrer · on June 21, 2012

I (the author) appreciate the feedback. I believe that many of your criticisms are addressed in the actual paper. First of all I completely agree that my sample size is too small for a conclusive proof. I mention in the paper that I hope that others will try and replicate this experiment on other pieces of software. I do think it's appropriate when conducting an experiment to publish a conclusion, not that the experiment will constitute proof (or an established scientific theory), but as a conclusion to the study that others can try to confirm or refute.

I also mention in the paper that it would be beneficial to conduct this experiment using different type systems for the reasons that you stated above.

The argument against static typing that I was testing didn't mention any particular type system nor any particular dynamically typed language, it was a general argument that stated that unit testing obviated static typing. Because the argument was so general and absolute I felt that any static type system that could be shown to expose bugs that were not caught by unit testing would be enough to refute the argument. I was not trying to prove that any type system would catch bugs not found by any unit tested software. The paper also points out that I'm trying to see whether unit testing obviates static typing in practice, in theory you could implement a poor mans type checker as unit tests, but my experiment was focused on whether in practice unit testing obviates static typing.

Finally I believe that my conclusion in the paper was at least a bit more modest than that of the blog post. The lack of apparent modesty in the blog post was caused more by a lack of ability on my part to accurately summarize than an inflated sense of accomplishment and self importance.

davesims · on June 21, 2012

Thanks for the response! I appreciate the effort you went to here, this was no small task you set yourself to.

I appreciate the clarification. I think now I see better where your emphasis was: the purpose of the paper was to refute an argument, and of course the level of burden of proof is different and far less in that case. I think this misunderstanding on my part is what caused me to call the conclusions 'trivial' -- too strong and dismissive language on my part anyway.

The irony is, you were attempting to do to the unit-testing-is-sufficient argument what I was attempting to do to what I assumed yours was: provide one counter-example to falsify a broad and generalized thesis.

That said, I think I would have liked to have seen your original unit-testing-is-sufficient argument punched up and qualified into something a little more reasonable and real-world. As you stated the argument, it seems like a straw man to me. It seems one could reduce your version of the argument to something like: "Dynamic languages with unit test coverage will always catch errors that statically-typed environments will." And of course this is far too broad and unqualified a statement, and that is precisely why all you needed was one counter-factual to refute it. You didn't even need a handful of Python programs, or 9 or 20 or 100 errors to prove your point. You only needed one, as you stated above. This is why the burden of proof for your thesis was so small, but also why, in my opinion, even with that reduced scope and more modest conclusion, we haven't really learned much.

As someone who has spent most of my career in statically-typed environments and the last 6 years or so mostly in dynamic environments, and also as someone who has made something like the argument you were attempting to refute, I have to say I would definitely never have made such a brittle and unqualified statement as the one you refuted in your paper. To put it more directly, I think I'm probably a poster-child for the kind of developer you were aiming your thesis at, and I don't feel that my perspective was adequately or reasonably represented. More importantly, having looked at the examples given in your paper, I may have learned a bit about the kinds of errors that Haskell can catch automatically that some coders might miss in a dynamic environment, but not much useful to me in my everyday work context.

I think a more reasonable version of the argument, but more qualified and therefore requiring a far larger sampling of code to prove or refute, would be something like: "Programs written in a dynamic language with adequate or near-100 percent unit test coverage, are no more prone to defects than programs written in a statically-typed language with a comparable level of unit test coverage."

I agree this is a very important conversation to have, and again kudos to the work you put in here. Obviously people have strong opinions both directions, and the discussion, however heated at various moments, is an important one, so thanks for this!

ScottBurson · on June 21, 2012

I see both sides of this argument. The OP -- at least in this blog post; I haven't read the paper -- spends most of his time talking about how he's demonstrated the insufficiency of unit testing. For the purpose of that argument, it really doesn't matter that he used Haskell as opposed to some other type checker.

It's only in the last two sentences of his "Conclusion" section that he turns the argument around, and here is where he oversteps:

While unit testing does catch many errors it is difficult to construct unit tests that will detect the kinds of defects that would be programatically detected by static typing. The application of static type checking to many programs written in dynamically typed programming languages would catch many defects that were not detected with unit testing[...]

Clearly, this is overbroad. For starters, he should have used "could" in place of "would". And it wouldn't have been a bad time to remind the reader that Haskell's type system differs from those of other statically typed languages with which the reader may be more familiar.

I don't quite agree, though, that the conclusion is "trivial". Maybe I'm just out of touch, but I wasn't aware of a good test of how true the dynamic argument was in practice, as opposed to theory -- particularly claim #2.

davesims · on June 20, 2012

I think I should clarify what I meant by "the conclusions are nevertheless trivial." Let's look at the key statement in the conclusion of the study:

"Based on these results, the conclusion can be reached that while unit testing can detect some type errors, in practice it is an inadequate replacement for static type checking."

As I've already pointed out, this seems to me an ambitious and over-reaching conclusion, given the scope of the study.

But, equally important, it is simply an example of something that was already provable. It should be axiomatic that in principle automatically-generated validation like that provided by static typing should in theory be able to catch type errors not caught manually in a dynamic context, either for reasons of human oversight or human error.

In other words, it seems to me that all that has been done here, is to provide a few concrete examples of what was already true and uncontroversial: auto-generated coverage of specific types of validations can be more comprehensive than some human beings will be in some environments and contexts. It has not shown that the perceived benefits of dynamic typing with good unit tests are outweighed by this fact, nor that, statistically-speaking, errors of this type are common enough to warrant a preference of static typing over dynamic typing with unit tests in all contexts.

papsosouid · on June 20, 2012

>In addition, the hidden assumption is that all static and dynamic typing are created equal, i.e., since Haskell is statically typed and Haskell appears to have caught Python bugs that unit tests did not, therefore Java will catch bugs in a Ruby codebase, C++ will catch bugs in a JavaScript codebase, etc.

That assumption isn't hidden, it is made up. By you. The question was "can static typing catch bugs that made it past a decent (and common) test suite". The answer to that can drive interest in static typing, and thus more language with useful static type systems. Just because java has a crappy type system, doesn't mean we should be content with that.

davesims · on June 20, 2012

> That assumption isn't hidden, it is made up. By you.

Not at all. The assumption is clearly implied by the conclusion of the study, which makes an unwarranted equivalence of all languages that have 'static type checking':

"The translation of these four software projects from Python to Haskell proved to be an effective way of measuring the effects of applying static type checking to unit tested software."

> Just because java has a crappy type system, doesn't mean we should be content with that.

I don't know what this means. If the study was meant to comprehend such a broad category as 'static type systems,' and from the explicit language of the study, it clearly was, then absolutely Java must necessarily be included. Otherwise, the study, as I noted, should have restricted its conclusions to a scope of Haskell vs. Python, with at most modest and well-qualified statements regarding the broader implications of static vs. dynamic in general.

papsosouid · on June 20, 2012

>which makes an unwarranted equivalence of all languages that have 'static type checking':

No it doesn't. Read what you quoted, it says nothing even remotely resembling "this benefit applies to all languages with static typing". It is testing static typing, not a specific language. It uses the best static typing system to do so. You are entirely inventing the notion that this must then apply to java.

> If the study was meant to comprehend such a broad category as 'static type systems,' and from the explicit language of the study, it clearly was, then absolutely Java must necessarily be included

No it mustn't. Comparing the best of dynamic vs the best of static is a useful test. Just as nobody is complaining they didn't use a worse language than python, it makes no sense to complain they didn't use a worse language than haskell. You don't draw conclusions about the potential of X by examining the worst example of X possible.

cantankerous · on June 20, 2012

This. Not all statically typed languages are created equal. Java's type system is old and is not state of the art. I wish people would stop using it as a straw man when anybody brings up static typing.

Java was state of the art 20 years ago, but it's definitely not the case any more.

silentbicycle · on June 20, 2012

Java wasn't even state of the art 20 years ago. ML dates back to the 70s.

cantankerous · on June 20, 2012

I agree with you, but I think we might be in the minority.

davesims · on June 20, 2012

Had the study qualified itself to merely "Haskell vs. Python" with deference given to the statistical significance of the sample size, you'd have a point. It wasn't me that brought all static typing, which of course includes Java, into the question at hand -- it was the study itself.

papsosouid · on June 20, 2012

Yes, it was you. Why do you think the comparison should be "really bad static type system" vs "really good dynamic type system"? In what way does that make the test more useful? Allow me to say this again, as I do not know how to be any clearer:

You do not test the potential of something by using the worst possible example of it. The only point of your desire is to reinforce the strawman that java = static typing. A test of "do airbags help prevent deaths" would be a very poor test if it used anything other than the best possible airbag technology.

cantankerous · on June 20, 2012

Since this hasn't been already mentioned, and I run the risk of really flaming things up. Java has a very high propensity of generating runtime type errors. This is easily done by skirting the type checker with casting, which is commonplace. The upshot of me saying this is that I'm actually on the fence of even considering Java to be a statically-typed language for this reason...which is part of why I disagree with the parent even using it as an example of a statically typed language equivalent to the one from this post in a counterexample (also included in this is C, C++, and the rest of that family).

sbmassey · on June 20, 2012

As soon as you start using reflection in Java, you're doing non-statically-typed programming. Since a lot of popular Java frameworks use reflection implicitly - such as Spring, Hibernate, etc - that includes a lot of Java code that's out there.

And also, even if you carefully put a layer of explicit typechecking between the reflection based code, and the statically typed stuff, you're still throwing out the Java generics typechecking since none of that exists at runtime, and so your ArrayList<String> can mysteriously contain non-String types when you finally access it.

scott_s · on June 20, 2012

I don't think davesims is saying that should be the comparison. This particular complaint is about the conclusions, not the methodology. (I recognize he also criticized the methodology.) Conclusions should be useful. People shouldn't have to squint at the wording of your conclusion to determine what that means for them. So, you should bend over backwards in your conclusion, and err on the side of being clear.

With that in mind, I agree with davesims that the conclusion in the blog post is too strong. It is: "The application of static type checking to many programs written in dynamically typed programming languages would catch many defects that were not detected with unit testing" I say it is too strong because the author has not bent over backwards to make clear that this conclusion only applies to the "best" type systems, like Haskell.

For the record, I like the study, and once I run the author's conclusions through my bend-over-backwards-filter, I find them interesting. I upvoted this article. I also upvoted davesims' post because it is academic-reviewer level feedback.

anamax · on June 20, 2012

> You do not test the potential of something by using the worst possible example of it.

So? Folks don't use the "potential", they use the real. They're asking questions like "should I use Java or Python".

> do airbags help prevent deaths" would be a very poor test if it used anything other than the best possible airbag technology.

That's not how things actually work. You decide between what's available. The performance of the best possible airbags is irrelevant. The real question is the cost and benefits of airbags that are likely to be deployed.

tikhonj · on June 20, 2012

And the answer to "should I use Java or Python" is: no! Use Haskell ;). If you're entirely tied to Java (and, in that case, Python would probably not be ideal), you can still use Scala.

The question the study was asking was not "what language should I use for my lowest-common-denominator workforce" but rather "can a static type system catch more errors than unit tests and can statically typed code be as expressive as dynamically typed code".

In other words, it was asking for existential quantification: "does there exist some type system such that..." rather than "forall type systems..." or even "forall average systems...".

papsosouid · on June 20, 2012

>So? Folks don't use the "potential", they use the real.

Haskell is real.

>They're asking questions like "should I use Java or Python".

That's wonderful, but it has nothing to do with the subject at hand, which was the question "can static typing reduce the number of bugs?". If you want an answer to a different question, don't complain about the answer given for this question, go find someone answering the question you want answered.

>That's not how things actually work. You decide between what's available. The performance of the best possible airbags is irrelevant. The real question is the cost and benefits of airbags that are likely to be deployed.

Why can't anyone follow a simple line of reasoning without resorting to fallacies? He tested the best airbags available. Not theoretical airbags that don't exist. He tested a car with the best airbags available to one without. The airbags were a benefit. You and the other guy making up fallacies insist that this isn't a fair comparison, because you want to drive a car where the airbags deploy 5 seconds after impact. Your crappy car isn't relevant to the question of "can airbags save lives".

davesims · on June 20, 2012

>Why can't anyone follow a simple line of reasoning without resorting to fallacies?

Indeed. The conclusion C was out of scope with the premises A and B. C is wrong, but that doesn't mean A and B cannot infer useful, more modest conclusions.

What I don't understand about every one of your responses is that you seem to think false equivalence applies in only one direction.

You seem to think it's fine for OP to infer broad conceptual conclusions from a small subset of the domain, but counter-examples to the broad claims cannot be applied, according to you, because, rather bizarrely you continue to insist that the counter-examples are too specific and and don't apply because the scope is general? That doesn't even make sense.

It's quite simple. OP claims "unit testing is not enough," "you need Static Typing" and uses broad language like "static type systems." I continually insist that such conclusions are out of the scope of the data given: The fact that type-related bugs were found in a handful of relatively small Python programs translated to an idiosyncratic environment like Haskell cannot possibly infer something so broad as what the OP is claiming.

Using Java/C++/Clojure/C#/etc. vs JavaScript/Lisp/Smalltalk/Ruby to give a counter-example is clearly within the scope of the argument. If OP had claimed something like "Python shows risk of static type errors, exposed by Haskell port" and claimed something like "more care and unit-testing is needed to guard against certain types of type-related bugs" I wouldn't have a problem. But that's not what OP claimed.

papsosouid · on June 20, 2012

>I continually insist that such conclusions are out of the scope of the data given:

Yes, clearly you have some serious issues to work through.

anamax · on June 21, 2012

> can static typing reduce the number of bugs

No one claims otherwise. However, that's true of Java's type system too.

> Why can't anyone follow a simple line of reasoning without resorting to fallacies?

I followed your simplistic line of reasoning just fine. It was wrong. Admit that and move on.

Of course you can't, which is how you got there.

The biggest obstacle to Haskell becoming more popular is its advocates.

And, it will never replace Java, C, Python, or even PHP. (One of my professional goals is to never use Java.)

Peaker · on June 21, 2012

> And, it will never replace Java, C, Python, or even PHP.

What do you mean by that?

Many people, including myself, have had Haskell replace Python.

papsosouid · on June 21, 2012

>I followed your simplistic line of reasoning just fine. It was wrong. Admit that and move on.

You are wrong, admit it and move on. Oh gee, does that not actually make a constructive argument?

>The biggest obstacle to Haskell becoming more popular is its advocates.

What does this have to do with anything?

>And, it will never replace Java, C, Python, or even PHP

It already has. You might be too foolish to take advantage of that fact, but how does your foolishness matter to me?

anamax · on June 26, 2012

> >And, it will never replace Java, C, Python, or even PHP

> It already has.

Oh really? Significantly fewer systems are being developed in those languages? How about some evidence?

What? You meant that a couple of applications have been written in Haskell instead of those applications? That's not "replace".

Which reminds me - if I find an application that was written in Haskell that is being replaced by an implementation written in some other language, would you claim that said other language is "replacing" Haskell? If not, don't make the mirror-argument.

scott_s · on June 20, 2012

I believe you are confusing criticisms of the methodology with criticisms of the strength of conclusions.

anamax · on June 20, 2012

> It uses the best static typing system to do so.

It doesn't use the best dynamic language or best unit tests.

papsosouid · on June 20, 2012

Then you should be proposing he use whatever language you feel is better than python at being the best dynamic type system. The best unit tests is entirely irrelevant.

anamax · on June 21, 2012

> Then you should be proposing he use whatever language you feel is better than python at being the best dynamic type system.

Nope.

> The best unit tests is entirely irrelevant

I can find errors in programs with a spell checker. Suppose that those programs have unit tests. Do you really think that spell checker is better than unit tests?

papsosouid · on June 21, 2012

Are you trolling or incapable of reading? Nobody, at any point in time suggested that static typing was an alternative to unit testing. You haven't posted a single constructive thing in this entire thread, and you waited till it was over to do your trolling so you could avoid downvotes. Grow up, or go back to reddit.

davesims · on June 20, 2012

>it says nothing even remotely resembling "this benefit applies to all languages with static typing".

That is precisely what it says, and that is reiterated later:

"...the conclusion can be reached that...in practice [unit testing] is an inadequate replacement for static type checking."

I'm not sure what you're reading, but there's no qualifications in the language used here regarding the idea of 'static type checking,' nothing so modest about the scope of the conclusion as claiming it was merely a "useful test" as you put it. It was a sweeping generalization about two very broad and extremely complex categories of languages. Had the conclusions used more moderate language and qualified itself adequately, I wouldn't have a problem. But all that has been shown here, is that in some contexts more care needs to be taken writing unit tests in a dynamic environment to catch some errors that are automatically caught in static environments. That is all that the data warrants.

CodeMage · on June 20, 2012

> That is precisely what it says

This is a very strong claim and it's false. The article doesn't say that anywhere. You interpret it that way.

I would hazard a guess that presenting your own interpretation as fact is what brought on those downvotes you complain about below.

davesims · on June 20, 2012

Can you show how I've misinterpreted the plain language of the conclusion section?

I'm under the (perhaps mistaken) assumption that in academic papers people tend to mean what they say and choose their language carefully, particularly in the conclusion section.

If the following are not in fact broad, strong claims about the nature of static and dynamic languages in general, then won't you please explain to me how I should interpret them?

Here are the quotes from the conclusion of the paper (emphasis mine):

"The translation of these four software projects from Python to Haskell proved to be an effective way of measuring the effects of applying static type checking to unit tested software."

"Based on these results, the conclusion can be reached that while unit testing can detect some type errors, in practice it is an inadequate replacement for static type checking."

CodeMage · on June 20, 2012

Honestly, at this point I can no longer tell whether you're misinterpreting or misrepresenting the conclusions. I'll make an honest attempt to argue, nevertheless.

"Static type checking" and "unit testing" are two concepts. There are numerous concrete implementations of these two concepts. The former is implemented in several languages, including C++ and Java and Haskell. The latter is implemented in several frameworks/tools, such as TestNG and PyUnit.

The article concludes that unit testing, as a technique for discovering and/or preventing defects, cannot wholly replace static type checking.

Apart from mentioning the concrete implementations of abstract techniques that the author used, the article does not conclude anything about the benefits of using specific languages, frameworks or tools.

What you have claimed so far is that:

1. there is a "hidden assumption is that all static and dynamic typing are created equal, i.e., since Haskell is statically typed and Haskell appears to have caught Python bugs that unit tests did not, therefore Java will catch bugs in a Ruby codebase, C++ will catch bugs in a JavaScript codebase, etc."

If anyone jumped to this conclusion, it was you. The only thing I can conclude from the article is that static typing checks such as those implemented in Haskell catch bugs that were not caught by unit testing logic such as that used in Python projects within the study. To conclude anything more I would need the data not present in the article, such as exactly what types of errors we caught or missed, etc.

2. the conclusion of the study "makes an unwarranted equivalence of all languages that have 'static type checking'"

It doesn't. The conclusion about the static type checking vs. unit testing might not be backed by enough solid data, but the conclusion makes no claims about languages, beyond specifying which languages were used in the study.

3. the claim that "this benefit applies to all languages with static typing" is "precisely what" the conclusion "says".

No occurrence of any phrase even remotely resembling the quote can be found in the article. Saying "this is precisely what it says" means "you'll find that phrase or one very similar to it in the text". Maybe you were trying to claim that "this is precisely what it means", but it's definitely what it "says".

All in all, the sweeping generalization about the concrete languages was introduced by you. My guess is that this is because you were, like me, frustrated by the vagueness of the article. I would have loved seeing more concrete data. Saying "X types of errors were found" is not as good as saying "the following types of errors were found" and that's just the start.

davesims · on June 20, 2012

"All in all, the sweeping generalization about the concrete languages was introduced by you."

I think the plain, direct language of the paper's conclusion is clear enough without me having to embellish it, and without its defenders extrapolating all of the qualifications and subtexts that they think I missed. You really don't have much to work with, because the paper's clumsy conclusion is small, blunt and unqualified in its scope. It takes a handful of small Python programs translated to an idiosyncratic language like Haskell and concluded:

"in practice [dynamic typing with unit testing] is an inadequate replacement for static type checking."

This is unequivocal language. There's no qualifications about language, context, or any kind of variables that might possibly dilute the strength of the conclusion.

On the other hand, Peter Cooper gives a great example elsewhere on this thread of a much better paper with much broader scope, more stats, and much more modest, qualified conclusions. This is the kind of language that is useful and gives me confidence that the authors didn't start out with an axe to grind and merely followed what metrics they had to the warranted conclusion, no more, no less:

"Even though the experiment seems to suggest that static typing has no positive impact on development time, it must not be forgotten that the experiment has some special conditions: the experiment was a one-developer experiment. Possibly, static typing has a positive impact in larger projects where interfaces need to be shared between developers. Furthermore, it must not be forgotten that previous experiments showed a positive impact of static type systems on development time."

http://www.cs.washington.edu/education/courses/cse590n/10au/...

zopa · on June 21, 2012

> "Saying "X types of errors were found" is not as good as saying "the following types of errors were found" and that's just the start."

The blog post is vague, but the paper (also available at the link) isn't. It identifies the particular errors found.

scott_s · on June 20, 2012

When you present conclusions in an academic paper, the onus is on the author to bend over backwards to prevent the reader from interpreting a stronger conclusion than intended. I think davesims' interpretation is fair given the language, and I were I reviewing the paper, I would have asked the author to temper his conclusions in a similar manner.

davesims · on June 20, 2012

From the downvotes I can only conclude that many of you wish the study didn't claim what it claims and are merely shooting the messenger. If anyone can point out rhetoric within the study that qualifies it in such a way as to make comparisons of other statically typed languages with other dynamically typed languages out-of-bounds or expressing a false equivalence within the scope of the conclusions of the study itself, I'll retract.

But so far all of the arguments I'm seeing against using, for instance, Java, are coming from a perspective not advocated by the study. You all have a point -- it's just not the point made by the paper.

tikhonj · on June 20, 2012

To simplify: there is a difference between "static typing is better than dynamic typing" and "all static typing is always better than all dynamic typing". It's basically the difference between ∃ and ∀.

Saying that "static typing is better than dynamic typing" is like the former: there exists some static typing system that is better than dynamic typing. Saying that "all static type systems are better than any dynamic system" is like the second. All the paper ever says is the first: "Based on these results, the conclusion can be reached that while unit testing can detect some type errors, in practice it is an inadequate replacement for static type checking." Note how it never claims to apply for all possible static type systems; rather, it just says that tests are an inadequate replacement for type systems in general (i.e. there exists some type system that catches more errors than tests). This is exactly like my first example.

In summary: a being better than b does not mean that all a is always better than all b. Just because static typing is better than dynamic typing does not imply that Java is always better than Python; it merely implies that some statically typed language is better than Python.

scott_s · on June 20, 2012

I agree with your characterization in your first paragraph, but I agree with davesims that the conclusions are too strong. If one has to do the level of analysis of the conclusions that you present in your second paragraph, then they are poorly worded. I find davesims' interpretation a reasonable one, which leads me to agree that the conclusions need to be tempered and clarified.

papsosouid · on June 20, 2012

You would do well to consider the very real possibility that it is in fact you who is misguided, and not the rest of the world. You come off sounding childish when you refuse to even consider the possibility that you are simply misinterpreting the purpose and conclusion of the study. The only reason most people can think of to explain your behaviour is that you have an axe to grind and just want to shoot down anything that paints static typing as a positive thing.

tikhonj · on June 20, 2012

When it says "static type checking" it does not mean "all static type checking" but rather "good static type checking". And this is what the study showed (ignoring issues of methodology and sample size for the sake of argument): a (good) static type system would have caught more errors than unit testing, therefore static typing is good.

Generalizing any comment to all static type systems is silly: there are language like C that have a static system but provide basically no additional safety at all. You can easily provide examples of really bad statically typed or dynamically typed languages, but these examples say nothing of static or dynamic typing in general: they're just bad. Questions about static vs dynamic typing can only be answered by the best (or at least good) examples of each.

Showing that a good statically typed system is more robust than a good dynamically typed system is a useful proxy for comparing static typing to dynamic typing. This is similar to a study on seat belts ignoring poor seat belts that strangle the passengers in the event of a crash.

In short: just because static typing is better does not mean all static type systems are better, because you can always come up with a sufficiently bad example of static typing.

papsosouid · on June 20, 2012

>I'm not sure what you're reading, but there's no qualifications in the language used here

That is precisely my point. You are saying "this comparison of coke vs pepsi is no good because they used cold coke, and when I drink warm coke it isn't very good". Yeah, no shit. Stop drinking warm coke. Your decision to drink warm soda has no bearing on the test of cold soda vs cold soda.

davesims · on June 20, 2012

> Yeah, no shit. Stop drinking warm coke.

Fine, then don't claim something like "All cokes in all contexts at all temperatures are better than all pepsis in all contexts at all temperatures."

This is equivalent to what the study does with static vs. dynamic. Your argument, if you actually had a point, would be something along the lines of, "wait I'm talking about this boutique hand-crafted cola (Haskell) I get at Whole Foods, not that old Coke (Java), that's 20 years out of date!"

You're trying to retro-actively reduce the scope of a study you didn't write. The conclusions clearly use generic language that brings all static typed languages into a comparison with all dynamic languages. The false equivalence is not mine! It's the study's. If you want it differently, go write your own study that reduces the scope of the conclusions.

hackinthebochs · on June 20, 2012

I'm gonna have to disagree with you about the conclusion you're drawing. Yes, they are using the generic phrasing of "static typing" vs "dynamic typing", but this is because the study was intended to test the concept of static vs dynamic typing, not particular instances of it. However, seeing as we only have specific instances from which to test, it used the best one currently in widespread use. I don't see this as a problem, nor do I think the wording of their conclusion necessarily implies anything about all instances of static typing currently in use. Sure, it left that open as a possible interpretation for people looking for justification of a preconceived notion, but you can't really blame that on the authors.

davesims · on June 20, 2012

> but this is because the study was intended to test the concept of static vs dynamic typing, not particular instances of it

Help me out here -- since the study confines itself to a handful of small Python programs translated to an idiosyncratic language like Haskell, how can the scope of the study possibly in any way qualify as a study on something so broad as "the concept of static vs. dynamic typing"?

Are you not confusing a better, more appropriate argument you'd make for the argument actually made in the paper?

EDIT: > Sure, it left that open as a possible interpretation for people looking for justification of a preconceived notion, but you can't really blame that on the authors.

Is that really an argument you want to make, that I can't blame an author for using broad and imprecise language that infers unwarranted conclusions in an academic paper?

hackinthebochs · on June 20, 2012

>Help me out here -- since the study confines itself to a handful of small Python programs translated to an idiosyncratic language like Haskell, how can the scope of the study possibly in any way qualify as a study on something so broad as "the concept of static vs. dynamic typing"?

You raise a good objection here. Is it possible to draw conclusions about the class of type systems labelled "static typing" vs dynamic typing by using a small sample of programs? I think this is where the impedence mismatch is occurring. The author seems to take static typing to mean "what can be currently accomplished through static typing", and thus he was justified in using the strongest static type system in use to do the study. Taking it this way, then the study seems meaningful.

Taking the other meaning, the class of type systems labelled static typing, then you end up with a very large set of languages each with (perhaps) varying amounts of power. Doing a study with just one static language does seem inadequate. Although, depending on the class of errors caught, it may still be valid. As far as I've seen, Haskell doesn't catch new classes of errors that are impossible in other systems, it just makes it a lot easier to do so. So essentially Haskell has the same power as other common type systems. If this holds, then the study would still be valid. (Admittedly I know very little about Haskell so I could be completely wrong).

TLDR: I see what you're saying, and I do agree that there needs to be more said before his conclusion can be supported by the study.

jaylevitt · on June 20, 2012

But isn't it problematic that it compared real-world average unit tests with the best-available type system?

Peaker · on June 21, 2012

I don't think so -- anyone is free to choose to use the best-available type system. You can't just choose to write the best possible unit tests.

He could only compare one of the best possible environments for writing dynamically typed code and unit tests to one of the best possible environments for writing statically typed code.

papsosouid · on June 20, 2012

>Fine, then don't claim something like "All cokes in all contexts at all temperatures are better than all pepsis in all contexts at all temperatures."

He didn't. He said "coke tasted better than pepsi". I've explained this to you several times already. You are the only one saying anything about "all the time in every context". You. Not the author, not his paper. You.

davesims · on June 20, 2012

Get me, still waiting over here for a relevant quote from the paper. I've given mine. Where are yours?

jaylevitt · on June 20, 2012

> He said "coke tasted better than pepsi"

I think he actually said "coke tastes better than pepsi". That verb tense has very different implications.

wissler · on June 20, 2012

You make a good point -- I don't think any statistical study will ever be able to show that static typing is better.

I do think that a rational argument can show it, but my argument is too long to fit into the margin.

scott_s · on June 20, 2012

I think that a study over a broad set of applications of considerable complexity could provide enough statistical evidence that most people would be comfortable coming to a conclusion. That study, though, would take a very large effort. Large enough that it may never be done.