Introduction
James Gosling, often referred to as "Dr. Java", is a Canadian computer scientist, best known as the father of the Java programming language. He did the original design of Java and implemented its original compiler and virtual machine. Our DevRel, Grigory Petrov, had the opportunity to interview James, and we have included the entire transcript below. Hope you enjoy it!
The Interview
Grigory: As software developers and software consultants, we're trying to organize a community in Russia: Python, Ruby, Java, and Go communities. And we want to help our fellow developers by conducting interviews that highlight essential questions for our industry. I think that your experience and your work on Java can help developers to become better. So let's try to help them!
Some languages, like Go, leave out classes and inheritance, while others experiment with features like traits in Rust. As a language designer, what do you think is a modern, general-purpose, reasonable way for a programming language to do composition?
James: I don't think I wouldn't do classes. I actually find that classes work pretty well for composition. I don't really have any good, clear ideas for what to do differently. And some of the things that I would do differently are a little strange. In C, there are macros, which are pretty much a disaster because the macros are not part of the language; they're kind of outside of it. The folks at Rust tried to do a decent job of fitting macros in the language.
Other languages, like all of the Lisp family, managed to fit them in more gracefully, but they had a way of defining syntax where the syntax was almost entirely free of semantics. And in most languages, syntax and semantics kind of go hand in hand. As somebody who has written a lot of Lisps in a past life, I am really addicted to the technique of using Lisp programs to manipulate Lisp programs. That's one thing that I really, really miss. Some languages let you do that in different ways, so like in Groovy, you can directly play with the AST. And Rust has some sort of syntactically integrated macros. But it's always felt to me like there's an interesting research question in there: can you do more?
Can I get the feel of Lisp doing computations on code fragments to generate new code? And in the Java world, people do that. It's one of the more popular features, except that it's really low level, because people use a combination of annotations and the fact that you can generate bytecode with some of the different languages. That is super powerful. It gets used in places you wouldn't expect, like in Jackson, which gets a lot of its performance by computing the serializer.
On one hand, it's a very powerful technique. On the other hand, it's just super hard to use. The fact that it's possible is great, but how far can you go? These techniques can be kind of limited. So if you look at something like Lombok, it's one of the things that I find to be... well, I have a strong love-hate relationship with it, because it adds a bunch of Java features that are pretty nice, but on the other hand, it shows a weakness, partly in the process, because this is a set of features that should just be built in. And the Java Community Process has become somewhat less of a community than it should be. I'm on the outside these days and have been for quite a few years, but there are things you could do that are just all over the map.
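For illustration, here is a minimal sketch of the annotation-driven style James describes, using Lombok's @Data annotation on a made-up Point class (the class and its fields are hypothetical, chosen only to show the idea):

    import lombok.Data;

    // @Data makes Lombok generate the getters, setters, equals/hashCode,
    // and toString that would otherwise be written out by hand.
    @Data
    public class Point {
        private int x;
        private int y;
    }

    // Roughly what the annotation saves you from writing:
    //   public int getX() { return x; }
    //   public void setX(int x) { this.x = x; }
    //   ... plus getY/setY, equals, hashCode, toString

The code the compiler finally sees is plain Java; the annotation processing fills in the boilerplate, which is exactly the kind of thing James argues should just be built into the language.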
Grigory: That's why we prepared questions about your fantastic experience with creating languages, and not about some modern Java enhancement proposal. Five years ago, I can confess, I manipulated some Java bytecode. For good, of course, but creating domain-specific languages that way is kind of tricky. With Ruby, it's much easier. We at Evrone are proficient with Ruby; we have dozens of Ruby developers. And Ruby developers are great, but they need many years of training to learn all the DSL magic.
James: One of the things with features like computing code fragments, the reason why it's awkward in Java, is that Java tries to go all the way to compiled machine code. And Ruby is pretty much always interpreted. When you're doing that, when you're not trying to get all the performance that you can, then life is easy. But if you're trying to get both powerful features and ultimate performance, life becomes much harder.
Grigory: Recently, we interviewed Yukihiro Matsumoto, the author of Ruby, and he mentioned that he had performed an experiment with his latest major Ruby 3.0 version. He tried to release this version without breaking changes to see what would happen. A major language version that doesn't break anything. I know that Java is cautious about not breaking things. Is it a good idea for all the languages to evolve without incompatibilities? Or is it a limited approach that can be used only for specific languages, like Ruby or Java?
James: It is almost entirely a function of the size of the developer community. Every breaking change induces pain in the developer community. If you don't have many developers, then breaking changes aren't a big problem. And you also have to think about the cost-benefit tradeoff. If you do a breaking change, it adds some pain, but it also brings some benefit. For example, if you change the subscript operator from square brackets to round brackets, it probably buys you absolutely nothing and induces terrific pain. That would be a dumb idea.
In JDK 9, there was a change, one of the very few breaking changes ever introduced, and what it broke was the use of some of the supposedly hidden APIs: the encapsulation mechanism changed, and people who were breaking encapsulation boundaries, using things that shouldn't be used in ways they shouldn't be used, had some pain moving from 8 to 9. But once we get beyond that, it allows the platform a lot more freedom to innovate. And in this particular case of the 8-to-9 transition, it means that the platform can be sliced and diced, and you can actually do custom packaging so that the Java Runtime Environment will be smaller.
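The "slicing and dicing" here refers to the module system that arrived in JDK 9. As a rough sketch of how the smaller runtime is achieved (the module name com.example.app is hypothetical):

    // module-info.java - the application states exactly which platform
    // modules it needs and which of its own packages it exposes.
    module com.example.app {
        requires java.sql;        // pull in only the pieces of the platform you use
        exports com.example.app;  // everything else stays encapsulated
    }

With the dependencies declared this way, a tool such as jlink can assemble a custom runtime image that contains only the modules the application actually needs, instead of the entire Java Runtime Environment.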
One other area where there is always a fair amount of discomfort is when there's a bug in something and people write workarounds for it: if you fix the bug, you may break the workarounds. There have certainly been instances in the Java world where we decided either not to fix a bug or to introduce a new method that does the correct thing. That even shows up in hardware. There was an issue with the sin and cos instructions: they were slightly incorrect, so you end up having to keep both the correct and the incorrect versions.
Grigory: Twenty-five years ago, when I started my own career as a software developer, I wrote a lot of C and C++ code. And I remember those mysterious pointer bugs that happened once a month. Debugging such bugs was a pain. But now, as a software developer, I see lots of tools integrated into our workflow, like static type checkers. Modern developers use IDEs like NetBeans, IntelliJ IDEA, or even Visual Studio Code. They write the source code, and a static type checker parses the program, constructs an abstract syntax tree, and checks everything it can. And then possible errors are highlighted right within the text editor. And such tricks are available not only for statically-typed languages but even for dynamically-typed ones like Python and Ruby, or for JavaScript via TypeScript. What is your opinion on these static type checkers we use today? Are they a step forward to writing better software, or do we need to put more inside the language syntax?
James: Well, both. I'm a big fan of languages with static type systems because they provide a scaffolding for the static type checkers and IDEs to work. I spent most of my life as a software engineer, and the least satisfying way for me to spend my time is hunting down obscure bugs that happen at weird times. And anything I can do to make bugs disappear before they waste my time is a good thing. So, I'm a big fan of just about anything an IDE can do to reduce the probability of a bug. When we look at dynamically-typed languages like JavaScript and Python, they have less of an inference framework to work with because they don't necessarily know what the type of anything is; they're just kind of guessing. Strongly-typed languages, like Java, provide a much more rigorous framework for the type checkers to use. And, going up another level, there are things that do full automatic theorem proving. There are systems like Dafny, which has a really sophisticated theorem prover, so if you want to build an encryption algorithm, you can mathematically prove properties of it. That may be a little too far, but for some code, it's really useful.
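As a tiny illustration of that scaffolding, here is the kind of mistake a Java compiler or IDE flags before the program ever runs (the method and names are made up):

    import java.util.List;

    public class Checked {
        // The declared types tell the checker exactly what is allowed here.
        static int totalLength(List<String> words) {
            int total = 0;
            for (String w : words) {
                total += w.length();
            }
            return total;
        }

        public static void main(String[] args) {
            System.out.println(totalLength(List.of("a", "bb")));  // fine: prints 3
            // totalLength(List.of(1, 2, 3));   // rejected at compile time:
            //                                  // a List<Integer> is not a List<String>
        }
    }

In a dynamically-typed language, the equivalent mistake would only surface at runtime, on whichever execution path happens to hit it.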
And a lot depends on what your goal really is. If you're a university student and you're trying to get your assignment done, or you're a Ph.D. student, and you're trying to graduate, then when you write a program, your goal is that it should work once. At least once. Because you have to do a demo and be able to show it off to see if it works. If you're in an industrial setting, where I have been most of my life, working once is only slightly useful. It has to work every time. And the difference between working once and working every time is huge. And so, if it only needs to work once, then the more dynamic languages work reasonably well. If you have to be sure that it's gonna work over and over again, all of the static typing tools help you come to that confidence. But if the kind of thing you're doing is... say, you're a physicist, and you want to find out the result of some computation, it only needs to run once. It depends on the context of the work you're doing. And the more reliability you need out of the software, the more a statically-typed language helps.
Grigory: Talking about enterprise and industrial development. I never programmed robots myself, but I spent time working for companies that create software for millions of people, and I can compare today and 20-25 years ago. And I see that right now, social coding platforms, like GitHub, are backed by big companies, and they help with open source development for both individual developers and enterprise or industrial software developers. So can we call today the Golden Age of open source software, or is it not so clear? What do you think about it?
James: I have no idea. You're asking a question about the future. And the problem with the question, “Is today the Golden Age"... that question implicitly says: "Is it downhill from here?" If this is the Golden Age, then tomorrow will be not-so-golden. And I think we’re leading up to it, whatever the Golden Age is. I think there are a lot of interesting improvements that can happen. Currently, we have all kinds of crises around security and how people can do cyberterrorism. And when that kind of stuff is going on, I don't think it is the Golden Age. If there's some way the collaboration of communities of people can lead to the end of cyberterrorism - that would be pretty golden. We’ll see. I mean, this is a really great time, but it could be better.
Grigory: You created Java and the JVM (Java Virtual Machine) with JIT (just-in-time compilation). JIT provides really amazing speed while keeping the language syntax pleasant and high-level. Many languages followed your lead, like C# and JavaScript. The speed of code compiled and recompiled over hot paths is close to C and C++. But many other languages, like Python, Ruby, and PHP, have optional JITs that are not so popular, and many mainstream languages don't use JIT at all to get that huge speed increase. Why don't all languages use JIT to provide fantastic speed for software developers?
James: To really get the performance improvements you see, it helps dramatically to have a statically-typed language. For dynamically-typed languages, like Python, it's really, really hard. And often, what people end up doing is adding annotations to the language so that you get languages like TypeScript, which is essentially JavaScript with type annotations. And it's really funny because JavaScript is essentially Java with the type declarations removed. So TypeScript is essentially Java with permuted syntax. They've got sort of Pascal-style declarations. But if you're somebody who's just slapping together quick scripts in Python, a lot of people in that world find declarations to be annoying. Thinking about the types of their variables is annoying.
In Python and many others, in general, there's only one kind of number, and that's the double-precision floating-point. There are no true integers, no bytes or 16-bit integers, and things like that conceptually add complexity, but they also improve performance. If you've got a double-precision floating-point versus a single-precision floating-point, there's a cognitive burden in that. To make intelligent tradeoffs, you have to understand a bit of numerical analysis. And it is reasonably common for people who are software engineers to know almost nothing about numerical analysis. And so they would just rather not think about it. And if you're a physicist using Python, you probably want all the precision you can get almost always. Unless, of course, you need to put a really big array in memory, where the difference between single-precision and double-precision or an 8-bit integer really matters. If you're living in a space where none of these things are really of any consequence, it's just easier for folks.
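The "really big array" case is where the tradeoff becomes concrete. A quick back-of-the-envelope sketch in Java (the element count is arbitrary):

    public class ArraySizes {
        public static void main(String[] args) {
            int n = 100_000_000;                         // one hundred million elements
            long doubleBytes = (long) n * Double.BYTES;  // 8 bytes each
            long floatBytes  = (long) n * Float.BYTES;   // 4 bytes each
            long byteBytes   = (long) n * Byte.BYTES;    // 1 byte each
            System.out.printf("double: %d MB, float: %d MB, byte: %d MB%n",
                    doubleBytes / 1_000_000, floatBytes / 1_000_000, byteBytes / 1_000_000);
            // Prints: double: 800 MB, float: 400 MB, byte: 100 MB
        }
    }

Whether that factor of eight matters is exactly the kind of numerical-analysis-flavored judgment call James is talking about.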
But if you need to care... I've taken too many numerical analysis courses in my lifetime and been burned by shoddy numerical analysis enough times that I tend to care. It depends on where you are on the spectrum, and most people in the scripting language world don't care about that sort of problem. Their concerns are at a very different level. And a lot of people don't actually care about performance and numbers in detail; they care about: "Is it fast enough?" Performance is kind of a boolean: it's fast enough, or it's not fast enough. For some people, it's more like tuning a race car. If you can get an extra two or three miles per hour out of a car, then you're more likely to win the race.
Grigory: I remember that a few months ago, David Heinemeier Hansson, the author of Ruby on Rails (one of the most widely used web frameworks), mentioned that only 15% of his cloud budget goes toward running the language itself. The rest goes to caches, message queues, storage, and so on. He told us that it isn't very important how "slow" Ruby is, because even if Ruby were 100 times faster and that 15% became 1%, it wouldn't change a lot. Modern languages are, indeed, "fast enough."
James: That depends a lot on where in the space of programs your task is. If the thing you're trying to accomplish is really dominated by networking and databases and all the rest of that, if you're doing RPCs all the time, probably the first thing you should do is question whether or not all of those RPCs are valuable. When people talk about microservices, they're a fine thing, but just understand that they're at least a factor of a million slower than a method call. Think through the implications of that. Usually, for most people, they'll get a lot more performance out of making sure that their large-scale architecture is clean. But there are a lot of folks for whom all the low-level details really do matter. If you know being highly concurrent is important, being able to drive thousands of processes at once, doing major computing... if you're doing something like a database itself or a major storage service, you really, really care. So it all depends on the task at hand.
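The "factor of a million" is easy to sanity-check: a plain method call costs on the order of nanoseconds, while a cross-service RPC adds serialization and a network round trip, which typically costs a millisecond or more. A small sketch of the arithmetic, with assumed (not measured) latencies:

    public class CallCostSketch {
        public static void main(String[] args) {
            double methodCallNanos = 5;          // assumption: a few ns for a plain call
            double rpcNanos = 1_000_000;         // assumption: ~1 ms for a network round trip
            System.out.printf("an RPC costs roughly %,.0f method calls%n",
                    rpcNanos / methodCallNanos);
            // ~200,000 here; add serialization, queueing, and retries and it easily
            // reaches the factor of a million James mentions.
        }
    }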
Grigory: Recently, we saw many languages embrace coroutines and the async/await approach to handle things like the network, which is slow. It was added to Python, to recent Ruby versions, to JavaScript, and to many other languages. But async/await and coroutines scheduled on one thread are not a silver bullet. They come with their own complications, and sometimes they can make software slower. So what do you think about this modern async/await hype? Is it a good way to handle the network, or do we just misuse it and need to look at Erlang and other approaches?
James: This is one of those things where context is everything. Coroutines are perfectly fine; they've been around since the ‘60s. The first language with coroutines was Simula 67. Simula was a lovely language. I still miss it. It didn't have threads, it had coroutines, but the way they did coroutines, they looked a lot like threads. Coroutines kind of magically sidestep some of the knotty issues in true parallelism. And for me, one of the problems with coroutines, which is why I haven't used them in a long time, is that they don't actually let you take advantage of multiple processors. You can't do true parallelism.
So people look at languages that have true parallelism, like Erlang and Java. The things that you have to do add another level of complexity, although, often, the way you deal with that complexity is by having very carefully curated primitives. The things that you can do with ConcurrentHashMap in Java are just magical. But as soon as you've got one of these coroutine-based languages and you try to exploit multiple processors, if you're doing a lot of coroutine-type operations and you don't have enough processors, you're just saturating one processor. And you would really like to be using multiple processors, because there are no uniprocessors in the world anymore, right? Everything's got a lot of cores, and if you really want to use all of your computer at once, on one problem, you just have to fight through and handle the complexity inherent in true multi-threading.
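As a small example of such a curated primitive, here is a ConcurrentHashMap being updated from two threads with no explicit locking at all (the word-counting task is made up for illustration):

    import java.util.List;
    import java.util.concurrent.ConcurrentHashMap;
    import java.util.concurrent.ConcurrentMap;

    public class WordCount {
        public static void main(String[] args) throws InterruptedException {
            ConcurrentMap<String, Integer> counts = new ConcurrentHashMap<>();
            List<String> words = List.of("java", "ruby", "java", "go", "java", "ruby");

            // Each thread folds its updates in atomically via merge();
            // no synchronized blocks, no lost updates.
            Runnable work = () -> words.forEach(w -> counts.merge(w, 1, Integer::sum));
            Thread t1 = new Thread(work);
            Thread t2 = new Thread(work);
            t1.start(); t2.start();
            t1.join(); t2.join();

            System.out.println(counts);   // {go=2, java=6, ruby=4}, in some order
        }
    }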
Then, there's the issue of style. Imagine a context where you can say "await this" and "await that", and they do this transparent inversion of control where you're passively yielding. That gives you a syntactic appearance that looks a lot like true threading, but it means you get to avoid a lot of the tricky bits of true threading. So if you say "a = a + 1", you know that in the middle of that operation you're not going to get interrupted, so you don't have to do synchronization. But then there are other places where, instead of that kind of style, it becomes an event-directed style, where you do your thing and then you plug an event handler into something to handle what happens when things are complete. That tends to be the predominant style in JavaScript. It works perfectly well, but it can get kind of clunky.
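That "a = a + 1" example is exactly where preemptive threads differ from cooperatively scheduled coroutines: with real threads the read-modify-write can be interleaved, so the increment has to be made atomic. A minimal sketch of both versions (the loop counts are arbitrary):

    import java.util.concurrent.atomic.AtomicInteger;

    public class IncrementRace {
        static int plain = 0;                               // unsafe under preemptive threads
        static final AtomicInteger atomic = new AtomicInteger();

        public static void main(String[] args) throws InterruptedException {
            Runnable work = () -> {
                for (int i = 0; i < 100_000; i++) {
                    plain = plain + 1;         // two threads can interleave here and lose updates
                    atomic.incrementAndGet();  // atomic read-modify-write, nothing is lost
                }
            };
            Thread t1 = new Thread(work);
            Thread t2 = new Thread(work);
            t1.start(); t2.start();
            t1.join(); t2.join();
            // plain usually ends up below 200000; atomic is always exactly 200000.
            System.out.println("plain=" + plain + " atomic=" + atomic.get());
        }
    }

Under a single-threaded await-style scheduler the plain version would be fine, which is the simplification James is pointing at.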
When I discovered Simula back in the early 70s, it had a sort of natural style. You just program, and you can think of your computation as a self-contained thing. And whether or not other things interleave with it is transparent to you. I found it, as a conceptual model, to be much cleaner than event programming. It's tougher to implement under the covers, but it's usually easier to think about.
Grigory: Simula was the first object-oriented language, after all! I never had the chance to work with it, but I checked the documentation, and it looks featureful. However, if we look at some modern languages like Ruby, the concurrency model is complex: we have processes, individual interpreters within processes, threads within individual interpreters, and coroutines within threads - like a Russian doll. Now a non-technical question, if you'll allow it. When we talk about different languages, in your personal opinion, what is the best language to teach new software developers right now, as their first language? Maybe in school or at university.
James: I'm clearly biased. Java has been used really successfully that way for quite a long time. But the first programming language I learned was PDP-8 assembly code, followed roughly concurrently by Fortran. You can teach people just about anything. It'll get through to some of them a lot more easily than to others, but a lot depends on what the eventual career path of a person is going to be. If you're going to be a full-up software developer building big, high-performance systems, it's hard to beat any of the languages that run on the JVM. And I actually don't care which language you use on the JVM. I mean, Scala and Kotlin are both fine. Clojure is really entertaining, but you have to think really differently. If you're a physics student, Python is fine.
And I don't think it's actually that big of a deal which one you choose. A lot of people just stick with the first thing they learned, but if you can get people to learn multiple languages and go back and forth... a really nice course that every university should be offering to every student is a comparative programming languages course. Over the semester, you have five assignments in five different programming languages, which gets people used to learning languages quickly, because they really are not all that different, and gets them to think about which ones are better. I took one of these courses a very long time ago, and I used the absolute worst language for every assignment. Doing numerical computing in Cobol - that was just entertaining! And symbolic manipulation in Fortran... surprisingly, I still got an A.
Grigory: As expected. So, the next question is about pattern matching. Recently, it gloriously landed in Python and Ruby, and lots of proposals are on the table in other languages. We checked the design documents, and even their authors are not entirely sure about the role of pattern matching in a modern, high-level language. How do you think this pattern matching idea fits into the toolbelt of ordinary, modern developers who work with Java, Python, Ruby, or some other high-level language? Do we really need pattern matching, or is it niche syntax for very specific use cases?
James: For starters, I think that the term "pattern matching" in programming languages is somewhat misleading. Because when I hear the phrase “pattern matching”, the thing that springs to my mind is regular expressions, whether regular expressions over character strings or regular expressions over trees. Maybe pattern matches over tree shapes, whatever. But going back to Simula. Simula had an inspect statement, and the inspect statement was almost exactly what many of these pattern matching statements are. Namely, an inspect statement is a case statement where the case labels are type names, so you'd say:
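In Simula that would read roughly as "inspect P when Image do ... when Vector do ... otherwise ...". The closest modern Java equivalent is the pattern-matching switch (Java 21); here is a sketch with made-up Image and Vector types:

    public class InspectSketch {
        sealed interface Shape permits Image, Vector {}
        record Image(int width, int height) implements Shape {}
        record Vector(double x, double y) implements Shape {}

        // A case statement whose labels are types; inside each arm the variable
        // already has that type, so no explicit cast is needed.
        static String describe(Shape p) {
            return switch (p) {
                case Image img  -> "an image " + img.width() + "x" + img.height();
                case Vector vec -> "a vector (" + vec.x() + ", " + vec.y() + ")";
                // no default needed: the sealed interface makes the switch exhaustive
            };
        }

        public static void main(String[] args) {
            System.out.println(describe(new Image(640, 480)));
            System.out.println(describe(new Vector(1.0, 2.0)));
        }
    }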
So you can think of it as a case statement that cases on the type, and most of these pattern matching language proposals are more of that kind of thing. And personally, I miss that. I really like that. Especially if what happens is like an implicit cast in C. So if you say "inspect P when Image do ...", then P, within the body of that case, now has the type of the case label. That makes life so much easier. There are all these places where, in a language with a C-like syntax, you end up with casts all the time. It looks like: "if a is an instance of x ... else if a is an instance of y ..." and so on. The "inspect" statement in Simula was just beautiful; I loved it. Since then, I have missed it every day, and many of these pattern-matching proposals and language features look like that. If you call them something like a "type case" - great idea. But if you call it "pattern matching" and it has less power than a regular expression, it feels misleading, like false advertising. But, as just a feature, I think it's great.
Grigory: Our last question is a bit obligatory. Russian software developers are proud of JetBrains and Kotlin development. Of course, I will not ask some trivial thing like Java vs. Kotlin and so on. I will try to ask something different. Kotlin and many other languages, like Clojure or Scala, thrive on the existing Java Virtual Machine that you created and the existing ecosystem of libraries, frameworks, and existing code. Is there any challenge all such languages face? Is there something that unites them? Some difficulty for them? When they're trying to hot-swap Java syntax with some different syntax, what challenges do they face?
James: It kind of depends on what you're trying to do. One of the peculiarities of the Java Virtual Machine is that there are many notions of security and reliability built into it, and they have to do mostly with the integrity of the memory model - pointers and things like that. So you cannot forge a pointer. If you look at languages like C: if you don't have the ability to forge pointers, you can't do C. There are some virtual machines out there that don't have a tight security model. On something like the JVM, if you try to implement C - and some people have done it, although it's odd - there are places you just can't go if you've got a rigorously secure virtual machine. But some folks have built virtual machines that are not rigorously secure, that don't have a memory allocation model. If you want to do interoperability between C and Kotlin, you have to be willing to give up a certain amount of security and reliability.
And so it depends on where you're willing to go. And certainly, at the dawn of Java, one of my personal rules was: I do not want to have to debug another freaking memory corruption bug. I had wasted way too much of my life on obscure memory corruption bugs that take you days. And it's just an off-by-one error in a loop that just happened to go one entry off the end of an array, and you don't find out until millions of instructions later. And I really, really hate chasing memory corruption bugs. So it depends on what you're comfortable with. Some people, you know, think spending time doing that is very manly. But there are also people who like to use vi, which was a great editor in the ‘70s and a good editor in the ‘80s... Come on, folks!
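The off-by-one loop James describes is exactly the bug that a memory-safe runtime turns from silent corruption into an immediate, local failure. A minimal sketch:

    public class OffByOne {
        public static void main(String[] args) {
            int[] values = new int[10];
            // Classic off-by-one: <= instead of < walks one entry past the end.
            for (int i = 0; i <= values.length; i++) {
                values[i] = i;   // in Java this throws ArrayIndexOutOfBoundsException
                                 // at i == 10, right at the faulty line, instead of
                                 // quietly overwriting whatever memory follows the array.
            }
        }
    }

In C, the same loop would usually run to completion and the damage would only show up millions of instructions later, which is the debugging experience James is describing.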
Grigory: The memory safety model is indeed at the core: it enables some things and limits others. Thank you so much, James! It was a pleasure to have this conversation with you, and I hope that after all this zombie apocalypse is over, we will meet in person at some offline conference. Thank you, and have a nice day!
The Conclusion
We are so grateful for the opportunity to talk with James and get his insight into the languages, features, and solutions that we use every day.
Also, we want to express gratitude to our colleague, Oleg Chirukhin from JetBrains, for assisting with the text version of the interview.