Academic Careers vs. Industry Careers

Greg Duncan (right) and Guy Lebanon (left) at Amazon's Q&A Session on Academic Careers vs Industry Careers

The following is an edited transcription of a Q&A session titled "Academic Careers vs. Industry Careers" given by Greg Duncan and Guy Lebanon to summer interns at Amazon in the summer of 2014. Some of the content is specifically aimed at the fields of machine learning, statistics, and economics. The content does not reflect the official perspective of Amazon in any way.

Q: What are your backgrounds?

Greg: I am the Chief Economist and Statistician in GIP. I do a variety of things, and depending on what day you are talking to me I'm an economist, a statistician, a machine learner, or a data scientist. I have a PhD in economics, an MS in statistics, and a BA in economics and English. I work on a variety of statistical and forecasting issues in our Global Inventory Planning process, and I do data science consulting generally within Amazon. I am responsible for the algorithm that powers the buy box on Amazon's web site.

I have 40 years of experience in data science, economics, and statistics. I was a full professor in both economics and statistics and had faculty appointments at Northwestern (6 years), UC Berkeley (12 years), WSU (9 years), Cal Tech (2 years), and USC (3 years). Currently, I have a part time appointment at UW's economics department in addition to my full time position at Amazon. I've had joint positions between industry and academia for most of my career. I've supervised more than 20 PhD dissertations and 18 of them are full professors, some of them in good places: CMU, Maryland, UCSD, and the University of Wisconsin. I have an early 1980 paper that lays out map reduce and mini-batch SGD (though using other names).

I quit publishing in 1989 since I was at Verizon and GTE Labs and they didn't allow publishing. I was the chief scientist at GTE Labs and I led the team that designed the spectrum auction algorithm for the Verizon wireless network. I also designed and fielded hundreds of consumer preference surveys. I then transitioned to consulting and worked at National Economics Research Associates (NERA; a Marsh McLennan firm) as senior VP and management committee member. Later I was Managing Director at the consulting firms Huron and Deloitte, and a Principal at the Brattle consulting group. I then retired, but about two years ago I came out of retirement to join Amazon.

Guy: I grew up and went to college in Israel. I came to the US in 2000 for a PhD at Carnegie Mellon University. I graduated in 2005 with a thesis in the area of machine learning. It was clear to me since my sophomore year in college (1997) that I wanted to be a professor. I worked towards that goal in a very focused way from 1997 to my PhD graduation in 2005. I was lucky to get a faculty job right after graduation and I joined Purdue as an assistant professor in 2005. Due to a two-body situation and geographical constraints I decided to move in 2008 and I ended up at Georgia Tech. I got tenure and promotion to associate professor in 2012 and then went on a one-year sabbatical at Google. At the end of my sabbatical my wife and I decided to stay on the west coast, and I decided to stay in industry but move to Amazon rather than stay at Google.

Q: What are your main messages about careers in industry vs. careers in academia?

Greg: Here are my main points:

(a) There is a lot more freedom in academia than in industry, but the problems are not as interesting.

(b) The stuff you do in industry ends up actually being used.

(c) Industry jobs pay well, while academic jobs do not.

Guy: First, I would like to point out that there is no objective right choice here. There are pros and cons and the right answer depends on the person, and possibly even the specific time in a person's life. For example, for me the right choice in 2005 was to join academia, but in 2013 it was to move to industry.

The main pros of academic jobs are:

(a) Freedom. Despite what some recent blogs say about professors not being completely free due to their need to publish and get grants, there is much more freedom in academia than in industry. This is true for young faculty members, but it is especially true for senior faculty members. They do not report to a manager in the same sense that they would in industry. As long as they remain somewhat active in publishing and grants, they can work on whatever they want.

(b) Theory. Academic jobs are much more suited than industry jobs for people who want to work on theory. This is true for pure mathematicians and for CS theorists, but it is also true for people who mix theory and algorithms.

(c) Teaching. Not everyone likes to teach, but if you do then this is a huge pro. I like teaching and I felt a lot of satisfaction from teaching students of all levels.

(d) Stability (see below).

The main pros of industry jobs are:

(a) Big Impact. A few people can accomplish something truly big and important on their own. Albert Einstein comes to mind. The other 99% need a lot of help, especially in computer science. The help includes a team, proprietary data, and massive computing resources that are all generally unavailable in academia. I want to clarify that I'm talking about really big accomplishments that revolutionize our world, for example, the Internet, cloud computing, smart-phones, and e-commerce. These revolutions all required a lot of people working together and a lot of financial support. In academia you can develop some theory that will influence these revolutions to some degree but it is very rare to actually be the main driver of these revolutions.

(b) Teamwork. Academic work is pretty solitary. Professors meet periodically with their students and colleagues, but it doesn't come close to the teamwork that you find in industry. Some people prefer to work alone, and that is fine (notice my point above though that real impact requires teamwork). But for people that enjoy being a part of a team or leading a team this aspect of industry can be a big pro.

(c) Two Body Careers and Geography. In the high-tech industry it is fairly easy to find a good job in many regions. For example, you can find many good opportunities in the US in San Francisco and the broader Bay Area, Seattle, New York City, Los Angeles, Boston, and Chicago. Some of these locations have more opportunities than others, but a good candidate can find good options in any of them, and in many other locations as well. In academia, a typical applicant sends 10–50 applications all over the country, and if they are lucky they get one or two offers — likely in places that are not their top choices. This is a problem for the professor, but it is an even bigger problem for the professor's spouse, who is likely to have their own geographic preferences.

(d) Compensation. Compensation is much better in industry than in academia. The compensation gap is minimal right after PhD graduation, but it widens significantly with time. It depends on many factors, but it is quite possible for the pay gap to grow to more than 100% after 10 years (possibly significantly more than 100%). Many young PhD graduates don't consider this a priority right away, but it becomes a bigger priority later on when the graduates plan on having kids, buying a house, etc.

(e) Opportunities (see next paragraph).

Workplace and role stability is much higher in academia, and it can be both a pro and a con. It is a pro since it is nice not to worry about what happens next and to be able to make long-term plans. It is a con since stability is negatively correlated with new opportunities. In industry people frequently leave and roles change, and this can often lead to exciting new opportunities. For example, my immediate manager at Amazon left a few months after I started. This caused some unexpected issues for me, but things turned out very well in the end — I got an opportunity to interact directly with Amazon's leadership team. Industry is full of opportunities, but one has to be ready for some amount of instability.

There are two stories that I want to mention next that are relevant for this discussion.

Story 1: As a professor and graduate student, I noticed that professors often "brainwash" graduate students that academic jobs are the best path forward for the best students, and that industry jobs should only be entertained if a faculty search is unsuccessful (or is unlikely to be successful). Professors convey this message to their students since they believe it to be true (after all they ended up being professors because that is what they believe). But there is an added incentive for professors to do that: placing your graduate students as faculty in top departments increases your reputation and your department's reputation and ranking. Indeed, the placement of graduate students as professors in good schools is a key component in how professors and departments are evaluated (both formally and informally). I personally witnessed department chairs and deans urging professors to push their top students to faculty jobs rather than industry.

While at school, graduate students receive this message from their thesis advisors and other professors they look up to. I want to offer here a different perspective and hopefully help people understand the relative pros and cons so they can make the best decision for themselves.

Story 2: A couple of years ago, after getting tenure at Georgia Tech, I had a minor midlife crisis. I could choose to proceed along many different paths as a tenured professor, but I didn't know what to choose. I could focus on teaching or writing a book; I could keep writing research papers in my current area, or switch to a different research area. I started writing a book. I spent some time studying new research fields (computational finance and renewable energy), thinking that I might want to change my research direction. I eventually understood that the main question I needed to answer is: "what do I want to do for the rest of my life?" This question is highly related to the question "looking back from retirement, what accomplishments will I consider to have been successful?"

The key insight for me was: publishing papers, having others cite my work, and getting awards is simply not good enough when the alternative is improving the world. In the future, when my kids ask me what I did during my long work days, I want to point to more than a stack of papers, decent pay, and a lengthy CV. I want to say that I played a big role in helping society and changing the world for the better. The chance of doing that in industry is much bigger than in academia. We are living in a unique age of increasingly fast revolutions: computers, the Internet, e-commerce, smartphones, social networks, e-books. These recent revolutions were all driven by companies like Amazon, Apple, Google, Facebook, and Microsoft. We'll see many more revolutions in our lifetime, including wearable devices, self-driving cars, online education, revolutions in medicine, and more. I do not mean to say that I can change the world by myself in a couple of years, but with the help of colleagues and over a period of several years it is possible.

Greg: I found most of my research topics in the nexus of computer science, economics, statistics, applied math, and psychology. At Amazon these areas are like tectonic plates hitting one another. Economists can do some things, statisticians can do other things, and data scientists can do other things. If you are in academia, it is not easy to work in that nexus. Many colleagues in the statistics department would say that is nice work but it is not really statistics stuff. The same thing would happen in computer science and economics departments. I find working in that nexus to be very exciting. It isn't clear what to do, but you realize that there is a problem and no one knows how to work on it. Another related issue is that in industry you need to solve a specific business problem, not a related problem as is often done in academia. It's a kind of freedom, but it is also a restriction. It's like writing a sonnet — you can do anything you want as long as you maintain its form.

I really like working at Amazon. Another place I had a lot of fun was Verizon, where I had a lot of impact. When I see people buying stuff from Amazon I can point and say I helped make this work well. Similarly, when I see people using Verizon phones I can say I helped develop that technology.

In my area of economics and statistics you probably want to go to academia first, have your midlife crisis, and then move to industry. My midlife crisis, by the way, was having three kids preparing to go to college and realizing that even a senior tenured professor at a good university can't send them to college and can't buy a house. Perhaps an industry postdoc position would work well, in that people wouldn't go straight to a faculty job but would get some industry experience first. Unfortunately, that does not happen frequently in economics or statistics.

Q: What limitations does an industry job place on future jobs? What if you choose industry first and then have a midlife crisis and want to move to academia?

Greg: In economics or statistics that would be a killer. However, I think in a growing field like data science perhaps you can switch to academia after industry. I think it is important to keep a connection with universities even when you are in industry. I have done that for much of my career. In fact, I recently taught a course on data science and I put the syllabus on the web. I got many requests from academics asking me to come and teach for their departments since they are trying to start an ML group.

Guy: This would not be a big deal if you are going to industry for a short amount of time (1–3 years) and then apply for faculty jobs. If you are in industry longer and want to keep the option of moving to academia, you need to find a place that allows and even encourages publishing. You don't have to publish like crazy, but you need to keep publishing. I also think that sometimes the most interesting papers come from industry, since industry works at a scale that academics don't have access to. Such papers are worth more than typical academic papers experimenting on standard public datasets and will be very valuable if you later apply for faculty jobs.

In machine learning, cutting-edge work is increasingly done in industry since academics don't have access to the really interesting data, evaluation methods, and computational resources. It is already impossible for academics to reproduce the large-scale ML experiments described in industry papers and build upon them, and this will become more common in the future. Once this happens more frequently, algorithmic and applied ML innovation will occur primarily in industry.

Greg: I want to amplify that. I get a lot of requests from former colleagues of mine who ask me to give them interesting Amazon data so they can work on it. The answer is usually no, and not just at Amazon — Google and other companies would usually say the same thing. In industry you have access to datasets that academics can only dream of. You will be picking up stuff that is quite relevant, and the market values that human capital significantly. Sometimes universities hire industry veterans — for example, USC does that. These veterans may not even have PhD degrees, but they have experience that is extremely valuable and generally unavailable in academia.

Guy: I want to mention a related issue. In academia the mindset is usually that you join a department and you stay there until retirement. Perhaps you move from one university to another once in the middle of your career. In the high-tech industry the mentality is very different these days. Many people join a company and then move after a few years, and then move again. Transitioning from one company to another is especially easy if you live in a high-tech hub like the Bay Area, Seattle, or New York City, but it happens in other places as well. As a result, young graduates should not feel that choosing between different industry options is a huge decision that will determine their entire career with no possibility to backtrack from it.

Greg: That is a good point. In academia I can't tell you the number of times someone made me an offer and I said no, I'm not interested, and then something changed and I called back after two years expressing interest and they said sorry — we can't reissue that offer at this point. That absolutely doesn't happen in industry. It's almost like fishing — we didn't get you this year but we'll try again next year. Maybe you will be upset for some reason and we'll be able to get you later. The key thing is we are in a growing and expanding industry, and as a consequence experienced people are in high demand.

Q: If you knew you were going to industry anyway would you still get a PhD and would you do anything different?

Guy: That is a very good question and if you look at the web there are a lot of blog posts discussing whether a PhD is worth it if you end up in industry. It's a parallel discussion to the discussion of academic vs. industry careers in that there are pros and cons and no right or wrong answer. It depends on the person, and the specific mindset they have at that point in their life.

In my case, I definitely do not regret doing the PhD. Doing a PhD means losing considerable money (often in terms of opportunity cost), but the way I see it one should optimize for their long-term career rather than for a 5–10 year horizon. If you do a PhD and move to industry, in some sense you lose 5 years, but there will be enough time in your future 30–40 years to make up for that financially and otherwise. The PhD experience will likely be the only time in your life where you can look at a problem and study it and become a real expert at it without distractions. In industry you have distractions like needing to launch a product or fix an urgent problem, and you may not be able to learn fundamental skills that take years to develop. For example, a deep understanding of machine learning requires deep knowledge of statistics, probability, real analysis, linear algebra, etc. If you are working on an ML product in industry you just don't have the time and peace of mind to dive deep and properly learn all these areas. I also think it is an enjoyable experience and it shapes your personality in some way. There is plenty of time afterwards to worry about product launches and changing the world.

Greg: I have a different perspective. In economics you really need a PhD, and in statistics it is getting to the point where you really need a PhD to do scientific work. The reason is that we have a mature discipline now. In the 1950s a very smart, creative MS or even BS could become a professor. These days I can pretty much tell who has a PhD and who doesn't based on the way they think. In my area, when you get to the MS level you feel like you know everything, and everyone at that level knows everything. If you do a PhD, your thesis advisor says: tell me something we have never heard of. You have to demonstrate your creativity, and you have to learn how to be creative and solve problems that are ambiguous and aren't well defined. The best theses are in areas where people may not even know that there are problems, and as a consequence people are doing things incorrectly. Such significant creativity does not usually come from people who didn't get a PhD, since they didn't develop that mindset. And so not having a PhD stands out much more than 60 years ago.

Guy: I agree with Greg, but I don't think it is quite as pronounced in computer science. You do normally need a PhD to be a scientist, but there are a lot of exciting innovations that can be accomplished without the PhD experience. I also want to say that being a scientist is not the only way to do exciting things and make a significant impact. You can be an engineer, a product manager, a people manager, or a UX designer and do amazing things that change the world.

Greg: I agree. I was referring to a research scientist working in areas of statistics or economics. Also, if you want to go the business route a PhD is not very relevant — you should get an MBA instead.

Q: What about publication policy at Amazon and at other companies?

Greg: If you are doing something that is very relevant to Amazon, the last thing you want is to publish it and have a competitor get it. At Verizon and GTE Labs we had a policy of no or minimal publication. We thought Bell Labs messed up and gave their stuff away. On the other hand, people who are starting their careers do need to publish, and I am sympathetic to the need to have people know about you and your work. Early in your career it is important to publish, which is why I think it makes sense to have an academic job first.

Guy: I think companies should publish to some extent, assuming their business interests are maintained. Many companies including Amazon have a publication policy where people submit a request to publish a paper and someone goes over the request and decides whether to let you publish, ask you to change the paper, or just prohibit it altogether at this point in time.

Q: What about choosing between a researcher career and a software engineer career?

Guy: I want to again clarify a misconception that researchers in industry are better or more senior than engineers. That is simply not the case. For example, my manager is a business person and doesn't have a PhD, but I have a lot of respect for him, his experience, and what he has accomplished. I have no problem taking direction from a manager, regardless of whether he is a scientist, an engineer, or a business person. Similarly, I have a lot of respect for my colleagues in different job ladders. Unless you are at a place that focuses on pure research (like MSR, though that may be changing there as well), the goals of both scientists and engineers are the same: develop a great product and help the company and its customers. People in different job ladders have different strengths, but everyone works as a team to get things done. A software engineer will have a really strong skill set in one area and a scientist will have a really strong skill set in another.

Q: How did you feel your impact change when you moved from academia to Google and when you moved from Google to Amazon?

Guy: When I was in academia I felt that I had an impact — people cited my work and told me that they liked my paper and that it inspired them. I got invited to give talks and received a few awards. I felt good about myself and my impact. But when I moved to industry and saw how industry is rapidly changing the world, it put the academic impact in perspective. What I previously thought of as big impact all of a sudden became minor and elusive. It was hard to point at concrete long-lasting ripple effects that my work had on the real world (except the influence I had on my students).

When I was at Google, I was officially on sabbatical and I kept the option of going back to academia open. This resulted in some lack of focus, and so my impact at Google was mostly scientific. At Amazon I officially resigned from Georgia Tech and jumped with both feet into an industry career. That change in mindset resulted in a more profound product impact.

Greg: I, too, went on sabbatical to GTE Labs and I intended to come back, but afterwards decided to stay in industry. I had a good publication record and citations, but it wasn't the kind of impact that I had at Verizon or that I have here at Amazon. The impact you can have in industry is real and immediate. It's like baking bread in that you work hard and the loaf comes out and you eat it — instant gratification. In academia you submit your article, and then you wait a while and the reviews come back with a revise-and-resubmit decision. Then you wait a bit, you go back and do some work, and then you resubmit. Perhaps at the end it gets into the journal, but by the time people read it and ask me about it, I'm working on something else and sometimes I hardly even remember the paper. I remember that somebody asked me about a paper recently and I had to look it up — I forgot I even wrote it. The point is that impact in industry is real and palpable, and people notice it. Also, about politics: many businesses are political, but what I've seen is that they are less political than any economics or statistics department I've been in. Perhaps Guy can comment on politics in CS.

Guy: At Purdue's ECE department there was quite a bit of politics that actually shocked me as a new assistant professor. The other departments I have been at were not very political.

Q: How do you compare work life balance between industry and academia?

Greg: I think work life balance is a lot better in industry. In academia, for the first 10 years people are very focused on getting tenure and then getting to full professor. If you are not working on weekends — you are not going to get promoted. It's as simple as that. At Amazon, and at GTE Labs and Verizon, you can organize your time so that you have more time for your family. On the other hand, we are all professionals and most of us would do the work for free. I'm the luckiest person in the world since they pay me for doing something that I love doing.

Guy: I had a different experience, in that I don't think work life balance in academia has to be much worse. There were about 3 years where I was working particularly hard, and I've seen my colleagues work particularly hard (years 2–4 of the tenure track). In general, I was at work from 8 in the morning until 5 in the afternoon, with occasional extra work later at night if needed. I do the same thing right now in industry. I do think it depends on the person, and in both industry and academia there are workaholics, or people who are influenced by workaholics to work very long hours.

One nice thing about academia is that work is more flexible and you can work from home (or other places) more. Also, it is really nice to have the academic flexibility during the summer semester.

Read "Rebust Principal Component Analysis" by Candes et al.


This paper is well written and can be used as a good example for training paper-writing skills. The way they wrote the introduction is fantastic and remarkably clear. Although reviewers might come up with a lot of negative comments at the very beginning, when everything sounds so bold, they will eventually be convinced by the discussion of the question and its solution, told as a story. To be honest, I like this paper, especially the introduction section.

It is worth reading the proof in great detail, but we will leave that for the future to save time.
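
For readers who want to experiment before (or instead of) working through the proofs, here is a minimal sketch of the Principal Component Pursuit program the paper analyzes, written in Python with NumPy. This is not the authors' reference code; the solver is a basic augmented-Lagrangian loop, and the parameter choices follow common defaults (lambda = 1/sqrt(max(m, n)), as suggested in the paper).

import numpy as np

def shrink(X, tau):
    # soft-thresholding: proximal operator of the l1 norm
    return np.sign(X) * np.maximum(np.abs(X) - tau, 0.0)

def svd_threshold(X, tau):
    # singular value thresholding: proximal operator of the nuclear norm
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    return U @ np.diag(shrink(s, tau)) @ Vt

def rpca(M, lam=None, mu=None, tol=1e-7, max_iter=500):
    # split M into a low-rank part L and a sparse part S by minimizing
    # ||L||_* + lam * ||S||_1 subject to L + S = M
    m, n = M.shape
    lam = lam if lam is not None else 1.0 / np.sqrt(max(m, n))
    mu = mu if mu is not None else (m * n) / (4.0 * np.abs(M).sum())
    L = np.zeros_like(M); S = np.zeros_like(M); Y = np.zeros_like(M)
    for _ in range(max_iter):
        L = svd_threshold(M - S + Y / mu, 1.0 / mu)
        S = shrink(M - L + Y / mu, lam / mu)
        residual = M - L - S
        Y += mu * residual
        if np.linalg.norm(residual) <= tol * np.linalg.norm(M):
            break
    return L, S

# toy usage: a rank-2 matrix corrupted by a few large sparse spikes
rng = np.random.default_rng(0)
M = rng.standard_normal((100, 2)) @ rng.standard_normal((2, 100))
M[rng.integers(0, 100, 50), rng.integers(0, 100, 50)] += 10.0
L, S = rpca(M)
# L should come out close to rank 2, with the spikes isolated in S
print(np.sum(np.linalg.svd(L, compute_uv=False) > 1e-3))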

Ten Lessons I Wish I Had Been Taught


Gian-Carlo Rota
MIT, April 20, 1996, on the occasion of the Rotafest

Allow me to begin by allaying one of your worries. I will not spend the next half hour thanking you for participating in this conference, or for your taking time away from work to travel to Cambridge.

And to allay another of your probable worries, let me add that you are not about to be subjected to a recollection of past events similar to the ones I've been publishing for some years, with a straight face and an occasional embellishment of reality.

Having discarded these two choices for this talk, I was left without a title. Luckily I remembered an MIT colloquium that took place in the late fifties; it was one of the first I attended at MIT. The speaker was Eugenio Calabi. Sitting in the front row of the audience were Norbert Wiener, asleep as usual until the time came to applaud, and Dirk Struik who had been one of Calabi's teachers when Calabi was an undergraduate at MIT in the forties. The subject of the lecture was beyond my competence. After the first five minutes I was completely lost. At the end of the lecture, an arcane dialogue took place between the speaker and some members of the audience, Ambrose and Singer if I remember correctly. There followed a period of tense silence. Professor Struik broke the ice. He raised his hand and said: "Give us something to take home!" Calabi obliged, and in the next five minutes he explained in beautiful simple terms the gist of his lecture. Everybody filed out with a feeling of satisfaction.

Dirk Struik was right: a speaker should try to give his audience something they can take home. But what? I have been collecting some random bits of advice that I keep repeating to myself, do's and don'ts of which I have been and will always be guilty. Some of you have been exposed to one or more of these tidbits. Collecting these items and presenting them in one speech may be one of the less obnoxious among options of equal presumptuousness. The advice we give others is the advice that we ourselves need. Since it is too late for me to learn these lessons, I will discharge my unfulfilled duty by dishing them out to you. They will be stated in order of increasing controversiality. 

Lecturing
Blackboard Technique
Publish the same result several times.
You are more likely to be remembered by your expository work.
Every mathematician has only a few tricks.
Do not worry about your mistakes.
Use the Feynman method.
Give lavish acknowledgments.
Write informative introductions.
Be prepared for old age.
1 Lecturing

The following four requirements of a good lecture do not seem to be altogether obvious, judging from the mathematics lectures I have been listening to for the past forty-six years.
a. Every lecture should make only one main point. The German philosopher G. W. F. Hegel wrote that any philosopher who uses the word "and" too often cannot be a good philosopher. I think he was right, at least insofar as lecturing goes. Every lecture should state one main point and repeat it over and over, like a theme with variations. An audience is like a herd of cows, moving slowly in the direction they are being driven towards. If we make one point, we have a good chance that the audience will take the right direction; if we make several points, then the cows will scatter all over the field. The audience will lose interest and everyone will go back to the thoughts they interrupted in order to come to our lecture.

b. Never run overtime. Running overtime is the one unforgivable error a lecturer can make. After fifty minutes (one microcentury as von Neumann used to say) everybody's attention will turn elsewhere even if we are trying to prove the Riemann hypothesis. One minute overtime can destroy the best of lectures.

c. Relate to your audience. As you enter the lecture hall, try to spot someone in the audience with whose work you have some familiarity. Quickly rearrange your presentation so as to manage to mention some of that person's work. In this way, you will guarantee that at least one person will follow with rapt attention, and you will make a friend to boot.

Everyone in the audience has come to listen to your lecture with the secret hope of hearing their work mentioned.

d. Give them something to take home. It is not easy to follow Professor Struik's advice. It is easier to state what features of a lecture the audience will always remember, and the answer is not pretty. I often meet, in airports, in the street and occasionally in embarrassing situations, MIT alumni who have taken one or more courses from me. Most of the time they admit that they have forgotten the subject of the course, and all the mathematics I thought I had taught them. However, they will gladly recall some joke, some anecdote, some quirk, some side remark, or some mistake I made.

 
2 Blackboard Technique

Two points.
a. Make sure the blackboard is spotless. It is particularly important to erase those distracting whirls that are left when we run the eraser over the blackboard in a non-uniform fashion.

By starting with a spotless blackboard, you will subtly convey the impression that the lecture they are about to hear is equally spotless.

b. Start writing on the top left hand corner. What we write on the blackboard should correspond to what we want an attentive listener to take down in his notebook. It is preferable to write slowly and in a large handwriting, with no abbreviations. Those members of the audience who are taking notes are doing us a favor, and it is up to us to help them with their copying. When slides are used instead of the blackboard, the speaker should spend some time explaining each slide, preferably by adding sentences that are inessential, repetitive or superfluous, so as to allow any member of the audience time to copy our slide. We all fall prey to the illusion that a listener will find the time to read the copy of the slides we hand them after the lecture. This is wishful thinking.


3 Publish the same result several times

After getting my degree, I worked for a few years in functional analysis. I bought a copy of Frederick Riesz' Collected Papers as soon as the big thick heavy oversize volume was published. However, as I began to leaf through, I could not help but notice that the pages were extra thick, almost like cardboard. Strangely, each of Riesz' publications had been reset in exceptionally large type. I was fond of Riesz' papers, which were invariably beautifully written and gave the reader a feeling of definitiveness.
As I looked through his Collected Papers however, another picture emerged. The editors had gone out of their way to publish every little scrap Riesz had ever published. It was clear that Riesz' publications were few. What is more surprising is that the papers had been published several times. Riesz would publish the first rough version of an idea in some obscure Hungarian journal. A few years later, he would send a series of notes to the French Academy's Comptes Rendus in which the same material was further elaborated. A few more years would pass, and he would publish the definitive paper, either in French or in English. Adam Koranyi, who took courses with Frederick Riesz, told me that Riesz would lecture on the same subject year after year, while meditating on the definitive version to be written. No wonder the final version was perfect.

Riesz' example is worth following. The mathematical community is split into small groups, each one with its own customs, notation and terminology. It may soon be indispensable to present the same result in several versions, each one accessible to a specific group; the price one might have to pay otherwise is to have our work rediscovered by someone who uses a different language and notation, and who will rightly claim it as his own.


4 You are more likely to be remembered by your expository work

Let us look at two examples, beginning with Hilbert. When we think of Hilbert, we think of a few of his great theorems, like his basis theorem. But Hilbert's name is more often remembered for his work in number theory, his Zahlbericht, his book Foundations of Geometry and for his text on integral equations. The term "Hilbert space" was introduced by Stone and von Neumann in recognition of Hilbert's textbook on integral equations, in which the word "spectrum" was first defined at least twenty years before the discovery of quantum mechanics. Hilbert's textbook on integral equations is in large part expository, leaning on the work of Hellinger and several other mathematicians whose names are now forgotten.
Similarly, Hilbert's Foundations of Geometry, the book that made Hilbert's name a household word among mathematicians, contains little original work, and reaps the harvest of the work of several geometers, such as Kohn, Schur (not the Schur you have heard of), Wiener (another Wiener), Pasch, Pieri and several other Italians.

Again, Hilbert's Zahlbericht, a fundamental contribution that revolutionized the field of number theory, was originally a survey that Hilbert was commissioned to write for publication in the Bulletin of the German Mathematical Society.

William Feller is another example. Feller is remembered as the author of the most successful treatise on probability ever written. Few probabilists of our day are able to cite more than a couple of Feller's research papers; most mathematicians are not even aware that Feller had a previous life in convex geometry.

Allow me to digress with a personal reminiscence. I sometimes publish in a branch of philosophy called phenomenology. After publishing my first paper in this subject, I felt deeply hurt when, at a meeting of the Society for Phenomenology and Existential Philosophy, I was rudely told in no uncertain terms that everything I wrote in my paper was well known. This scenario occurred more than once, and I was eventually forced to reconsider my publishing standards in phenomenology.

It so happens that the fundamental treatises of phenomenology are written in thick, heavy philosophical German. Tradition demands that no examples ever be given of what one is talking about. One day I decided, not without serious misgivings, to publish a paper that was essentially an updating of some paragraphs from a book by Edmund Husserl, with a few examples added. While I was waiting for the worst at the next meeting of the Society for Phenomenology and Existential Philosophy, a prominent phenomenologist rushed towards me with a smile on his face. He was full of praise for my paper, and he strongly encouraged me to further develop the novel and original ideas presented in it.


5 Every mathematician has only a few tricks

A long time ago an older and well known number theorist made some disparaging remarks about Paul Erdos' work. You admire Erdos' contributions to mathematics as much as I do, and I felt annoyed when the older mathematician flatly and definitively stated that all of Erdos' work could be reduced to a few tricks which Erdos repeatedly relied on in his proofs. What the number theorist did not realize is that other mathematicians, even the very best, also rely on a few tricks which they use over and over. Take Hilbert. The second volume of Hilbert's collected papers contains Hilbert's papers in invariant theory. I have made a point of reading some of these papers with care. It is sad to note that some of Hilbert's beautiful results have been completely forgotten. But on reading the proofs of Hilbert's striking and deep theorems in invariant theory, it was surprising to verify that Hilbert's proofs relied on the same few tricks. Even Hilbert had only a few tricks!

6 Do not worry about your mistakes

Once more let me begin with Hilbert. When the Germans were planning to publish Hilbert's collected papers and to present him with a set on the occasion of one of his later birthdays, they realized that they could not publish the papers in their original versions because they were full of errors, some of them quite serious. Thereupon they hired a young unemployed mathematician, Olga Taussky-Todd, to go over Hilbert's papers and correct all mistakes. Olga labored for three years; it turned out that all mistakes could be corrected without any major changes in the statement of the theorems. There was one exception, a paper Hilbert wrote in his old age, which could not be fixed; it was a purported proof of the continuum hypothesis, you will find it in a volume of the Mathematische Annalen of the early thirties. At last, on Hilbert's birthday, a freshly printed set of Hilbert's collected papers was presented to the Geheimrat. Hilbert leafed through them carefully and did not notice anything.
Now let us shift to the other end of the spectrum, and allow me to relate another personal anecdote. In the summer of 1979, while attending a philosophy meeting in Pittsburgh, I was struck with a case of detached retinas. Thanks to Joni's prompt intervention, I managed to be operated on in the nick of time and my eyesight was saved.

On the morning after the operation, while I was lying on a hospital bed with my eyes bandaged, Joni dropped in to visit. Since I was to remain in that Pittsburgh hospital for at least a week, we decided to write a paper. Joni fished a manuscript out of my suitcase, and I mentioned to her that the text had a few mistakes which she could help me fix.

There followed twenty minutes of silence while she went through the draft. "Why, it is all wrong!" she finally remarked in her youthful voice. She was right. Every statement in the manuscript had something wrong. Nevertheless, after laboring for a while, she managed to correct every mistake, and the paper was eventually published.

There are two kinds of mistakes. There are fatal mistakes that destroy a theory; but there are also contingent ones, which are useful in testing the stability of a theory.


7 Use the Feynman method

Richard Feynman was fond of giving the following advice on how to be a genius. You have to keep a dozen of your favorite problems constantly present in your mind, although by and large they will lay in a dormant state. Every time you hear or read a new trick or a new result, test it against each of your twelve problems to see whether it helps. Every once in a while there will be a hit, and people will say: "How did he do it? He must be a genius!"

8 Give lavish acknowledgments

I have always felt miffed after reading a paper in which I felt I was not being given proper credit, and it is safe to conjecture that the same happens to everyone else. One day, I tried an experiment. After writing a rather long paper, I began to draft a thorough bibliography. On the spur of the moment, I decided to cite a few papers which had nothing whatsoever to do with the content of my paper, to see what might happen.
Somewhat to my surprise, I received letters from two of the authors whose papers I believed were irrelevant to my article. Both letters were written in an emotionally charged tone. Each of the authors warmly congratulated me for being the first to acknowledge their contribution to the field.


9 Write informative introductions

Nowadays, reading a mathematics paper from top to bottom is a rare event. If we wish our paper to be read, we had better provide our prospective readers with strong motivation to do so. A lengthy introduction, summarizing the history of the subject, giving everybody his due, and perhaps enticingly outlining the content of the paper in a discursive manner, will go some of the way towards getting us a couple of readers.
As the editor of the journal Advances in Mathematics, I have often sent submitted papers back to the authors with the recommendation that they lengthen their introduction. On occasion I received by return mail a message from the author, stating that the same paper had been previously rejected by Annals of Mathematics because the introduction was already too long.


10 Be prepared for old age

My late friend Stan Ulam used to remark that his life was sharply divided into two halves. In the first half, he was always the youngest person in the group; in the second half, he was always the oldest. There was no transitional period.
I now realize how right he was. The etiquette of old age does not seem to have been written up, and we have to learn it the hard way. It depends on a basic realization, which takes time to adjust to. You must realize that, after reaching a certain age, you are no longer viewed as a person. You become an institution, and you are treated the way institutions are treated. You are expected to behave like a piece of period furniture, an architectural landmark, or an incunabulum.

It matters little whether you keep publishing or not. If your papers are no good, they will say, "What did you expect? He is a fixture!" and if an occasional paper of yours is found to be interesting, they will say, "What did you expect? He has been working at this all his life!" The only sensible response is to enjoy playing your newly-found role as an institution.

From Machine Learning to Machine Reasoning by Léon Bottou



This paper points out a new direction for machine learning. When building larger machine learning systems, probabilistic modeling alone is not enough; a machine reasoning component should kick in. However, there is not yet much mature research on this, which makes machine reasoning with causality considerations an important new direction.

Machine-Learning Maestro Michael Jordan on the Delusions of Big Data and Other Huge Engineering Efforts

Big-data boondoggles and brain-inspired chips are just two of the things we’re really getting wrong

By Lee Gomes

Michael Jordan. Photo-Illustration: Randi Klett

The overeager adoption of big data is likely to result in catastrophes of analysis comparable to a national epidemic of collapsing bridges. Hardware designers creating chips based on the human brain are engaged in a faith-based undertaking likely to prove a fool’s errand. Despite recent claims to the contrary, we are no further along with computer vision than we were with physics when Isaac Newton sat under his apple tree.

Those may sound like the Luddite ravings of a crackpot who breached security at an IEEE conference. In fact, the opinions belong to IEEE Fellow Michael I. Jordan, Pehong Chen Distinguished Professor at the University of California, Berkeley. Jordan is one of the world’s most respected authorities on machine learning and an astute observer of the field. His CV would require its own massive database, and his standing in the field is such that he was chosen to write the introduction to the 2013 National Research Council report “Frontiers in Massive Data Analysis.” San Francisco writer Lee Gomes interviewed him for IEEE Spectrum on 3 October 2014.

Michael Jordan on…

  1. Why We Should Stop Using Brain Metaphors When We Talk About Computing
  2. Our Foggy Vision About Machine Vision
  3. Why Big Data Could Be a Big Fail
  4. What He’d Do With US $1 Billion
  5. How Not to Talk About the Singularity
  6. What He Cares About More Than Whether P = NP
  7. What the Turing Test Really Means

  1. Why We Should Stop Using Brain Metaphors When We Talk About Computing

    IEEE Spectrum: I infer from your writing that you believe there’s a lot of misinformation out there about deep learning, big data, computer vision, and the like.

    Michael Jordan: Well, on all academic topics there is a lot of misinformation. The media is trying to do its best to find topics that people are going to read about. Sometimes those go beyond where the achievements actually are. Specifically on the topic of deep learning, it’s largely a rebranding of neural networks, which go back to the 1980s. They actually go back to the 1960s; it seems like every 20 years there is a new wave that involves them. In the current wave, the main success story is the convolutional neural network, but that idea was already present in the previous wave. And one of the problems with both the previous wave, that has unfortunately persisted in the current wave, is that people continue to infer that something involving neuroscience is behind it, and that deep learning is taking advantage of an understanding of how the brain processes information, learns, makes decisions, or copes with large amounts of data. And that is just patently false.

    Spectrum: As a member of the media, I take exception to what you just said, because it’s very often the case that academics are desperate for people to write stories about them.

    Michael Jordan: Yes, it’s a partnership.

    Spectrum: It’s always been my impression that when people in computer science describe how the brain works, they are making horribly reductionist statements that you would never hear from neuroscientists. You called these “cartoon models” of the brain.

    Michael Jordan: I wouldn’t want to put labels on people and say that all computer scientists work one way, or all neuroscientists work another way. But it’s true that with neuroscience, it’s going to require decades or even hundreds of years to understand the deep principles. There is progress at the very lowest levels of neuroscience. But for issues of higher cognition—how we perceive, how we remember, how we act—we have no idea how neurons are storing information, how they are computing, what the rules are, what the algorithms are, what the representations are, and the like. So we are not yet in an era in which we can be using an understanding of the brain to guide us in the construction of intelligent systems.

    Spectrum: In addition to criticizing cartoon models of the brain, you actually go further and criticize the whole idea of “neural realism”—the belief that just because a particular hardware or software system shares some putative characteristic of the brain, it’s going to be more intelligent. What do you think of computer scientists who say, for example, “My system is brainlike because it is massively parallel.”

    Michael Jordan: Well, these are metaphors, which can be useful. Flows and pipelines are metaphors that come out of circuits of various kinds. I think in the early 1980s, computer science was dominated by sequential architectures, by the von Neumann paradigm of a stored program that was executed sequentially, and as a consequence, there was a need to try to break out of that. And so people looked for metaphors of the highly parallel brain. And that was a useful thing.

    But as the topic evolved, it was not neural realism that led to most of the progress. The algorithm that has proved the most successful for deep learning is based on a technique called back propagation. You have these layers of processing units, and you get an output from the end of the layers, and you propagate a signal backwards through the layers to change all the parameters. It’s pretty clear the brain doesn’t do something like that. This was definitely a step away from neural realism, but it led to significant progress. But people tend to lump that particular success story together with all the other attempts to build brainlike systems that haven’t been nearly as successful.
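
    To make the description above concrete, here is a minimal sketch of back propagation for a tiny two-layer network, written in Python with NumPy. It is only a toy illustration of the idea Jordan describes (run the layers forward, then propagate an error signal backwards to adjust every parameter), not code from any system discussed in the interview; all sizes and learning-rate values are made up.

import numpy as np

rng = np.random.default_rng(0)
X = rng.standard_normal((200, 3))                   # toy inputs
y = (X[:, :1] - 2 * X[:, 1:2] > 0).astype(float)    # toy binary targets, shape (200, 1)

W1 = 0.1 * rng.standard_normal((3, 8)); b1 = np.zeros(8)   # first layer of "processing units"
W2 = 0.1 * rng.standard_normal((8, 1)); b2 = np.zeros(1)   # output layer
sigmoid = lambda z: 1.0 / (1.0 + np.exp(-z))

for step in range(2000):
    # forward pass: the signal flows through the layers to the output
    h = np.tanh(X @ W1 + b1)
    p = sigmoid(h @ W2 + b2)

    # backward pass: propagate the error signal back through the layers
    grad_logits = (p - y) / len(X)                  # gradient of cross-entropy loss w.r.t. output pre-activation
    grad_W2 = h.T @ grad_logits; grad_b2 = grad_logits.sum(axis=0)
    grad_h = (grad_logits @ W2.T) * (1.0 - h ** 2)  # back through the tanh units
    grad_W1 = X.T @ grad_h; grad_b1 = grad_h.sum(axis=0)

    # update all the parameters in the direction that reduces the error
    lr = 0.5
    W1 -= lr * grad_W1; b1 -= lr * grad_b1
    W2 -= lr * grad_W2; b2 -= lr * grad_b2

print(((p > 0.5) == y).mean())   # training accuracy on the toy data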

    Spectrum: Another point you’ve made regarding the failure of neural realism is that there is nothing very neural about neural networks.

    Michael Jordan: There are no spikes in deep-learning systems. There are no dendrites. And they have bidirectional signals that the brain doesn’t have.

    We don’t know how neurons learn. Is it actually just a small change in the synaptic weight that’s responsible for learning? That’s what these artificial neural networks are doing. In the brain, we have precious little idea how learning is actually taking place.

    Spectrum: I read all the time about engineers describing their new chip designs in what seems to me to be an incredible abuse of language. They talk about the “neurons” or the “synapses” on their chips. But that can’t possibly be the case; a neuron is a living, breathing cell of unbelievable complexity. Aren’t engineers appropriating the language of biology to describe structures that have nothing remotely close to the complexity of biological systems?

    Michael Jordan: Well, I want to be a little careful here. I think it’s important to distinguish two areas where the word neural is currently being used.

    One of them is in deep learning. And there, each “neuron” is really a cartoon. It’s a linear-weighted sum that’s passed through a nonlinearity. Anyone in electrical engineering would recognize those kinds of nonlinear systems. Calling that a neuron is clearly, at best, a shorthand. It’s really a cartoon. There is a procedure called logistic regression in statistics that dates from the 1950s, which had nothing to do with neurons but which is exactly the same little piece of architecture.
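
    As a concrete illustration of that equivalence (a toy sketch with made-up numbers, not anyone's production code): a single sigmoid "neuron" computes exactly the logistic-regression model, a linear-weighted sum passed through the logistic nonlinearity.

import numpy as np

def neuron(x, w, b):
    z = np.dot(w, x) + b               # linear-weighted sum of the inputs
    return 1.0 / (1.0 + np.exp(-z))    # logistic (sigmoid) nonlinearity

x = np.array([0.5, -1.2, 3.0])         # hypothetical input features
w = np.array([0.8, 0.1, -0.4])         # hypothetical weights
b = 0.2                                # hypothetical bias
print(neuron(x, w, b))                 # the same value logistic regression assigns to P(y = 1 | x)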

     A second area involves what you were describing and is aiming to get closer to a simulation of an actual brain, or at least to a simplified model of actual neural circuitry, if I understand correctly. But the problem I see is that the research is not coupled with any understanding of what algorithmically this system might do. It’s not coupled with a learning system that takes in data and solves problems, like in vision. It’s really just a piece of architecture with the hope that someday people will discover algorithms that are useful for it. And there’s no clear reason that hope should be borne out. It is based, I believe, on faith, that if you build something like the brain, that it will become clear what it can do.

    Spectrum: If you could, would you declare a ban on using the biology of the brain as a model in computation?

    Michael Jordan: No. You should get inspiration from wherever you can get it. As I alluded to before, back in the 1980s, it was actually helpful to say, “Let’s move out of the sequential, von Neumann paradigm and think more about highly parallel systems.” But in this current era, where it’s clear that the detailed processing the brain is doing is not informing algorithmic process, I think it’s inappropriate to use the brain to make claims about what we’ve achieved. We don’t know how the brain processes visual information.

  2. Our Foggy Vision About Machine Vision

    Spectrum: You’ve used the word hype in talking about vision system research. Lately there seems to be an epidemic of stories about how computers have tackled the vision problem, and that computers have become just as good as people at vision. Do you think that’s even close to being true?

    Michael Jordan: Well, humans are able to deal with cluttered scenes. They are able to deal with huge numbers of categories. They can deal with inferences about the scene: “What if I sit down on that?” “What if I put something on top of something?” These are far beyond the capability of today’s machines. Deep learning is good at certain kinds of image classification. “What object is in this scene?”

    But the computational vision problem is vast. It’s like saying when that apple fell out of the tree, we understood all of physics. Yeah, we understood something more about forces and acceleration. That was important. In vision, we now have a tool that solves a certain class of problems. But to say it solves all problems is foolish.

    Spectrum: How big of a class of problems in vision are we able to solve now, compared with the totality of what humans can do?

    Michael Jordan: With face recognition, it’s been clear for a while now that it can be solved. Beyond faces, you can also talk about other categories of objects: “There’s a cup in the scene.” “There’s a dog in the scene.” But it’s still a hard problem to talk about many kinds of different objects in the same scene and how they relate to each other, or how a person or a robot would interact with that scene. There are many, many hard problems that are far from solved.

    Spectrum: Even in facial recognition, my impression is that it still only works if you’ve got pretty clean images to begin with.

    Michael Jordan: Again, it’s an engineering problem to make it better. As you will see over time, it will get better. But this business about “revolutionary” is overwrought.

  3. Why Big Data Could Be a Big Fail

    Spectrum: If we could turn now to the subject of big data, a theme that runs through your remarks is that there is a certain fool’s gold element to our current obsession with it. For example, you’ve predicted that society is about to experience an epidemic of false positives coming out of big-data projects.

    Michael Jordan: When you have large amounts of data, your appetite for hypotheses tends to get even larger. And if it’s growing faster than the statistical strength of the data, then many of your inferences are likely to be false. They are likely to be white noise.

    Spectrum: How so?

    Michael Jordan: In a classical database, you have maybe a few thousand people in it. You can think of those as the rows of the database. And the columns would be the features of those people: their age, height, weight, income, et cetera.

    Now, the number of combinations of these columns grows exponentially with the number of columns. So if you have many, many columns—and we do in modern databases—you’ll get up into millions and millions of attributes for each person.

    Now, if I start allowing myself to look at all of the combinations of these features—if you live in Beijing, and you ride your bike to work, and you work in a certain job, and are a certain age—what's the probability you will have a certain disease or you will like my advertisement? Now I'm getting combinations of millions of attributes, and the number of such combinations is exponential; it gets to be the size of the number of atoms in the universe.

    Those are the hypotheses that I’m willing to consider. And for any particular database, I will find some combination of columns that will predict perfectly any outcome, just by chance alone. If I just look at all the people who have a heart attack and compare them to all the people that don’t have a heart attack, and I’m looking for combinations of the columns that predict heart attacks, I will find all kinds of spurious combinations of columns, because there are huge numbers of them.

    So it’s like having billions of monkeys typing. One of them will write Shakespeare.
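
    Jordan’s point lends itself to a small simulation. Below is a minimal sketch (the 200 people and 5,000 noise columns are arbitrary illustrative choices, not figures from the interview) in which every column is pure noise and the outcome is independent of all of them, yet scanning for the best-correlated column still turns up an apparently strong “predictor” by chance alone.

```python
# Chance "discoveries" from scanning many columns: all features are noise
# and the outcome is independent of them, yet the best of 5,000 columns
# still correlates noticeably with the outcome.
import numpy as np

rng = np.random.default_rng(0)
n_people, n_columns = 200, 5000

X = rng.standard_normal((n_people, n_columns))   # pure-noise features
y = rng.integers(0, 2, size=n_people)            # outcome, independent of X

# Pearson correlation of each column with the outcome.
xc = X - X.mean(axis=0)
yc = y - y.mean()
corr = (xc.T @ yc) / (n_people * X.std(axis=0) * y.std())

best = np.argmax(np.abs(corr))
print(f"best chance 'predictor': column {best}, |r| = {abs(corr[best]):.2f}")
# Under the null, |r| is roughly 1/sqrt(n) ~ 0.07 for any single column,
# but the maximum over 5,000 columns is typically around 0.3 -- spurious,
# yet impressive-looking if reported in isolation.
```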

    Spectrum: Do you think this aspect of big data is currently underappreciated?

    Michael Jordan: Definitely.

    Spectrum: What are some of the things that people are promising for big data that you don’t think they will be able to deliver?

    Michael Jordan: I think data analysis can deliver inferences at certain levels of quality. But we have to be clear about what levels of quality. We have to have error bars around all our predictions. That is something that’s missing in much of the current machine learning literature.
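
    As one concrete example of what “error bars around predictions” can look like in practice (a generic textbook device, not something Jordan prescribes here), a bootstrap percentile interval attaches an uncertainty range to an estimate by resampling the observed data and recomputing the statistic:

```python
# Bootstrap error bars: resample the data with replacement, recompute the
# statistic each time, and report the spread of the recomputed values.
import numpy as np

rng = np.random.default_rng(1)
data = rng.exponential(scale=2.0, size=500)   # stand-in for observed data

boot_means = np.array([
    rng.choice(data, size=data.size, replace=True).mean()
    for _ in range(2000)                      # 2,000 bootstrap resamples
])
lo, hi = np.percentile(boot_means, [2.5, 97.5])
print(f"estimate = {data.mean():.2f}, "
      f"95% bootstrap interval = ({lo:.2f}, {hi:.2f})")
```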

    Spectrum: What will happen if people working with data don’t heed your advice?

    Michael Jordan: I like to use the analogy of building bridges. If I have no principles, and I build thousands of bridges without any actual science, lots of them will fall down, and great disasters will occur.

    Similarly here, if people use data and the inferences they can make from it without any concern about error bars, about heterogeneity, about noisy data, about the sampling pattern, about all the kinds of things that you have to be serious about if you’re an engineer and a statistician—then you will make lots of predictions, and there’s a good chance that you will occasionally solve some really interesting problems. But you will also occasionally make some disastrously bad decisions. And you won’t know the difference a priori. You will just produce these outputs and hope for the best.

    And so that’s where we are currently. A lot of people are building things hoping that they work, and sometimes they will. And in some sense, there’s nothing wrong with that; it’s exploratory. But society as a whole can’t tolerate that; we can’t just hope that these things work. Eventually, we have to give real guarantees. Civil engineers eventually learned to build bridges that were guaranteed to stand up. So with big data, it will take decades, I suspect, to get a real engineering approach, so that you can say with some assurance that you are giving out reasonable answers and are quantifying the likelihood of errors.

    Spectrum: Do we currently have the tools to provide those error bars?

    Michael Jordan: We are just getting this engineering science assembled. We have many ideas that come from hundreds of years of statistics and computer science. And we’re working on putting them together, making them scalable. A lot of the ideas for controlling what are called familywise errors, where I have many hypotheses and want to know my error rate, have emerged over the last 30 years. But many of them haven’t been studied computationally. It’s hard mathematics and engineering to work all this out, and it will take time.

    It’s not a year or two. It will take decades to get right. We are still learning how to do big data well.
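
    The familywise-error ideas Jordan mentions include classical corrections such as Bonferroni: when m hypotheses are tested at once, each individual test is held to the stricter threshold alpha/m. The sketch below (my own noise-only construction, for illustration) contrasts the naive threshold with the corrected one:

```python
# Familywise error control via the Bonferroni correction: with m hypotheses,
# require p < alpha / m before declaring any single one a discovery.
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)
n, m, alpha = 200, 5000, 0.05

X = rng.standard_normal((n, m))    # noise features
y = rng.integers(0, 2, size=n)     # binary outcome, independent of X

# Two-sample t-test of each feature across the two outcome groups.
pvals = np.array([
    stats.ttest_ind(X[y == 0, j], X[y == 1, j]).pvalue for j in range(m)
])

naive = int((pvals < alpha).sum())           # expect ~alpha * m = 250 false hits
corrected = int((pvals < alpha / m).sum())   # usually 0, as it should be
print(f"naive rejections: {naive}, Bonferroni rejections: {corrected}")
```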

    Spectrum: When you read about big data and health care, every third story seems to be about all the amazing clinical insights we’ll get almost automatically, merely by collecting data from everyone, especially in the cloud.

    Michael Jordan: You can’t be completely a skeptic or completely an optimist about this. It is somewhere in the middle. But if you list all the hypotheses that come out of some analysis of data, some fraction of them will be useful. You just won’t know which fraction. So if you just grab a few of them—say, if you eat oat bran you won’t have stomach cancer or something, because the data seem to suggest that—there’s some chance you will get lucky. The data will provide some support.

    But unless you’re actually doing the full-scale engineering statistical analysis to provide some error bars and quantify the errors, it’s gambling. It’s better than just gambling without data. That’s pure roulette. This is kind of partial roulette.

    Spectrum: What adverse consequences might await the big-data field if we remain on the trajectory you’re describing?

    Michael Jordan: The main one will be a “big-data winter.” After a bubble, when people invested and a lot of companies overpromised without providing serious analysis, it will bust. And soon, in a two- to five-year span, people will say, “The whole big-data thing came and went. It died. It was wrong.” I am predicting that. It’s what happens in these cycles when there is too much hype: assertions not based on an understanding of what the real problems are, or on an understanding that solving them will take decades of steady progress rather than a sudden technical leap. And then there will be a period during which it will be very hard to get resources to do data analysis. The field will continue to go forward, because it’s real, and it’s needed. But the backlash will hurt a large number of important projects.

  4. What He’d Do With $1 Billion

    Spectrum: Considering the amount of money that is spent on it, the science behind serving up ads still seems incredibly primitive. I have a hobby of searching for information about silly Kickstarter projects, mostly to see how preposterous they are, and I end up getting served ads from the same companies for many months.

    Michael Jordan: Well, again, it’s a spectrum. It depends on how a system has been engineered and what domain we’re talking about. In certain narrow domains, it can be very good, and in very broad domains, where the semantics are much murkier, it can be very poor. I personally find Amazon’s recommendation system for books and music to be very, very good. That’s because they have large amounts of data, and the domain is rather circumscribed. With domains like shirts or shoes, it’s murkier semantically, and they have less data, and so it’s much poorer.

    There are still many problems, but the people who build these systems are hard at work on them. What we’re getting into at this point is semantics and human preferences. If I buy a refrigerator, that doesn’t show that I am interested in refrigerators in general. I’ve already bought my refrigerator, and I’m probably no longer interested in them. Whereas if I buy a song by Taylor Swift, I’m more likely to buy more songs by her. That has to do with the specific semantics of singers and products and items. To get that right across the wide spectrum of human interests requires a large amount of data and a large amount of engineering.

    Spectrum: You’ve said that if you had an unrestricted $1 billion grant, you would work on natural language processing. What would you do that Google isn’t doing with Google Translate?

    Michael Jordan: I am sure that Google is doing everything I would do. But I don’t think Google Translate, which involves machine translation, is the only language problem. Another example of a good language problem is question answering, like “What’s the second-biggest city in California that is not near a river?” If I typed that sentence into Google currently, I’m not likely to get a useful response.

    Spectrum: So are you saying that for a billion dollars, you could, at least as far as natural language is concerned, solve the problem of generalized knowledge and end up with the big enchilada of AI: machines that think like people?

    Michael Jordan: So you’d want to carve off a smaller problem that is not about everything, but which nonetheless allows you to make progress. That’s what we do in research. I might take a specific domain. In fact, we worked on question-answering in geography. That would allow me to focus on certain kinds of relationships and certain kinds of data, but not everything in the world.

    Spectrum: So to make advances in question answering, will you need to constrain them to a specific domain?

    Michael Jordan: It’s an empirical question about how much progress you could make. It has to do with how much data is available in these domains. How much you could pay people to actually start to write down some of those things they knew about these domains. How many labels you have.

    Spectrum: It seems disappointing that even with a billion dollars, we still might end up with a system that isn’t generalized, but that only works in just one domain.

    Michael Jordan: That’s typically how each of these technologies has evolved. We talked about vision earlier. The earliest vision systems were face-recognition systems. That’s domain bound. But that’s where we started to see some early progress and had a sense that things might work. Similarly with speech, the earliest progress was on single detached words. And then slowly, it started to get to be where you could do whole sentences. It’s always that kind of progression, from something circumscribed to something less and less so.

    Spectrum: Why do we even need better question-answering? Doesn’t Google work well enough as it is?

    Michael Jordan: Google has a very strong natural language group working on exactly this, because they recognize that they are very poor at certain kinds of queries. For example, using the word not. Humans want to use the word not. For example, “Give me a city that is not near a river.” In the current Google search engine, that’s not treated very well.

  5. How Not to Talk About the Singularity

    Spectrum: Turning now to some other topics, if you were talking to someone in Silicon Valley, and they said to you, “You know, Professor Jordan, I’m a really big believer in the singularity,” would your opinion of them go up or down?

    Michael Jordan: I luckily never run into such people.

    Spectrum: Oh, come on.

    Michael Jordan: I really don’t. I live in an intellectual shell of engineers and mathematicians.

    Spectrum: But if you did encounter someone like that, what would you do?

    Michael Jordan: I would take off my academic hat, and I would just act like a human being thinking about what’s going to happen in a few decades, and I would be entertained just like when I read science fiction. It doesn’t inform anything I do academically.

    Spectrum: Okay, but knowing what you do academically, what do you think about it?

    Michael Jordan: My understanding is that it’s not an academic discipline. Rather, it’s partly philosophy about how society changes, how individuals change, and it’s partly literature, like science fiction, thinking through the consequences of a technology change. But as far as I can tell, they don’t produce algorithmic ideas that inform us about how to make technological progress; at least, I never see any.

  6. What He Cares About More Than Whether P = NP

    Spectrum: Do you have a guess about whether P = NP? Do you care?

    Michael Jordan: I tend to be not so worried about the difference between polynomial and exponential. I’m more interested in low-degree polynomial—linear time, linear space. P versus NP has to do with the categorization of problems as polynomial, which means they are tractable, or exponential, which means they’re not.

    I think most people would agree that P is probably not equal to NP. As a piece of mathematics, it’s very interesting to know. But it’s not a hard and sharp distinction. There are many exponential-time algorithms that, partly because of the growth of modern computers, are still viable in certain circumscribed domains. And moreover, for the largest problems, polynomial is not enough. Polynomial just means that the cost grows at a certain superlinear rate, like quadratic or cubic. But it really needs to grow linearly: if you get five more data points, you need only five more units of processing. Or even sublinearly, like logarithmic: as I get 100 new data points, the cost grows by two; if I get 1,000, it grows by three.

    That’s the ideal. Those are the kinds of algorithms we have to focus on. And that is very far away from the P versus NP issue. It’s a very important and interesting intellectual question, but it doesn’t inform that much about what we work on.
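
    The scaling hierarchy he is describing (logarithmic, linear, polynomial, exponential) can be made concrete with a few numbers; the snippet below is purely illustrative:

```python
# How different cost curves respond as the data grows: logarithmic cost
# barely moves, linear tracks the data, quadratic outpaces it, and
# exponential becomes astronomical even for modest n.
import math

for n in (100, 1_000, 10_000):
    print(f"n = {n:>6}:  log10(n) = {math.log10(n):3.0f}   "
          f"n^2 = {n**2:>12,}   2^n ~ 10^{n * math.log10(2):,.0f}")
```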

    Spectrum: Same question about quantum computing.

    Michael Jordan: I am curious about all these things academically. It’s real. It’s interesting. It doesn’t really have an impact on my area of research.

  7. What the Turing Test Really Means

    Spectrum: Will a machine pass the Turing test in your lifetime?

    Michael Jordan: I think you will get a slow accumulation of capabilities, including in domains like speech and vision and natural language. There will probably not ever be a single moment in which we would want to say, “There is now a new intelligent entity in the universe.” I think that systems like Google already provide a certain level of artificial intelligence.

    Spectrum: They are definitely useful, but they would never be confused with being a human being.

    Michael Jordan: No, they wouldn’t be. I don’t think most of us think the Turing test is a very clear demarcation. Rather, we all know intelligence when we see it, and it emerges slowly in all the devices around us. It doesn’t have to be embodied in a single entity. I can just notice that the infrastructure around me got more intelligent. All of us are noticing that all of the time.

    Spectrum: When you say “intelligent,” are you just using it as a synonym for “useful”?

    Michael Jordan: Yes. What our generation finds surprising—that a computer recognizes our needs and wants and desires, in some ways—our children find less surprising, and our children’s children will find even less surprising. It will just be assumed that the environment around us is adaptive; it’s predictive; it’s robust. That will include the ability to interact with your environment in natural language. At some point, you’ll be surprised by being able to have a natural conversation with your environment. Right now we can sort of do that, within very limited domains. We can access our bank accounts, for example. Those interactions are very, very primitive. But as time goes on, we will see those things get more subtle, more robust, more broad. At some point, we’ll say, “Wow, that’s very different from when I was a kid.” The Turing test has helped get the field started, but in the end, it will be sort of like Groundhog Day—a media event, but something that’s not really important.

About the Author

Lee Gomes, a former Wall Street Journal reporter, has been covering Silicon Valley for more than two decades.