By now I ought to feel more knowledgeable about R

I was surprised to find recently that there are now over 15,000 lines of R code in the book I am working on. If I had written that much code in another ‘newly’ acquired language I would probably feel a lot more knowledgeable about it than I currently feel about R. Why don’t I feel more knowledgeable about R?

Those 15,000 lines are not all real lines; lots of cut-and-paste has been going on. Yes, R is a cut-and-paste language, just like Cobol and ‘web’ languages. ‘Real’ programmers often look down their noses at such languages, but that is just a failure on their part to understand what these languages are really all about. Perhaps I have written 5,000 actual lines of R, still a decent amount and halfway to the 10,000-line minimum I ask newbies whether they have reached.

An expert in a language should be able to pick up a random sample of code and feel that they have been there, done that and got the t-shirt. I still regularly learn new stuff when reading other people’s code, so I’m still a long way from being an R expert. But then R is in the mold of a functional language, and one characteristic of languages in this mold is that they provide umpteen different ways of doing the same thing. The combination of this language characteristic and the lack of a common culture in R usage (when such a culture exists, it significantly reduces the variety of coding patterns commonly encountered) could mean that I am on a treadmill of forever learning new R coding techniques (which is a great source for blog articles but gets tedious after a while). Perl is a lot like this.
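
A small illustration of the ‘umpteen different ways’ point, using only base R. This is a sketch written for this post, not code from the book; the vector `x` and the variable names are made up for the example. It squares every element of a numeric vector five different ways and checks that the results agree:

```r
# Five base-R ways to square every element of a numeric vector.
x <- c(1, 2, 3, 4)   # hypothetical example data

# 1. Vectorized arithmetic
sq_vec <- x^2

# 2. sapply with an anonymous function
sq_sapply <- sapply(x, function(v) v^2)

# 3. vapply, which also checks the type and length of each result
sq_vapply <- vapply(x, function(v) v^2, numeric(1))

# 4. An explicit for loop filling a preallocated vector
sq_loop <- numeric(length(x))
for (i in seq_along(x)) {
  sq_loop[i] <- x[i]^2
}

# 5. Map, which returns a list, flattened with unlist
sq_map <- unlist(Map(function(v) v^2, x))

# All five approaches produce the same values
stopifnot(
  isTRUE(all.equal(sq_vec, sq_sapply)),
  isTRUE(all.equal(sq_vec, sq_vapply)),
  isTRUE(all.equal(sq_vec, sq_loop)),
  isTRUE(all.equal(sq_vec, sq_map))
)
```

Any of these might turn up in code written by somebody else, which is part of why reading other people’s R keeps producing something new.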

As a compiler guy I’m used to learning a language by reading the language definition. Reading this document gives me a warm fuzzy feeling of knowing the language; this has nothing to do with being able to program in it, and there is no way of knowing whether I understood what the words meant. I was going to say that the R language definition was little more than some brief notes jotted down by somebody to be written up later, but checking the link to the page I discovered that somebody had been spending time significantly improving on what existed a few years ago; there is still a way to go, but the R language definition is starting to look respectable. Hopefully my feeling of R knowledgeability will improve after I have read through this updated definition a few times.

Use of R is usually intimately bound up with the data being manipulated; on a per-line-of-code basis much more so than in other languages (in this regard it is like Cobol). Perhaps having to learn a lot more about the data than I normally do adds to my feeling of not knowing. Would my feeling of knowledgeability increase if I worked with the same kind of data all the time?

  1. Bernhard
    March 19, 2014 07:18 | #1

    Could you please elaborate on why you think of R as a ‘copy&paste-language’? Because I really don’t see it that way. R has functions, lambda-expressions and all types of loops including the apply-family and a very nice package-system and little boilerplate so there really doesn’t seem much reason to copy and paste a lot.
    Maybe it’s true, that functional languages offer more ways to do something (not sure yet) but they also have a reputation of allowing very short code to do a lot.
    I can understand that some languages give a “warm fuzzy feeling” and that is a very subjective thing. I can tell from the cold shivers that Java gave me, every time I tried to learn it. I don’t think I can really put the finger into the wound why Java gives me these feelings.

    Having to think about data when doing data analysis is nothing you can hold against a programming language. Of course, in some other languages you have to think more about boilerplate code before you can actually tackle the data. If that is what makes the “warm fuzzy feeling” then I don’t want that.

    Thanks for the blog,
    Bernhard

  2. March 20, 2014 00:32 | #2

    @Bernhard
    A cut&paste language is one where a non-trivial number of new programs are created by copying chunks of code from existing programs. A common reason for doing this is that the data being manipulated is very similar in both programs. Why not just extend the existing program? It is often simpler and quicker to have separate programs, which allows them to evolve in their own way (academics witter on about clones being bad for maintainability, but have not noticed that a lot of code is not maintained).
