Math and Stats

2017 Trjitzinsky Awards

AMS Feed - Thu, 2017-10-12 23:00

The AMS has made $3,000 awards to seven undergraduate students through the Waldemar J. Trjitzinsky Memorial Fund, which is made possible by a bequest from the estate of Waldemar J., Barbara G., and Juliette Trjitzinsky. Read about this year's awardees.

Categories: Math and Stats

Candès Named MacArthur Fellow

AMS Feed - Tue, 2017-10-10 23:00

Emmanuel Candès, Stanford University, is among the class of 2017 MacArthur Fellows. He is "known for developing a unified framework for addressing a range of problems in engineering and computer science, most notably compressed sensing. Candès's research focuses on reconstructing high-resolution images from small numbers of random measurements, as well as recovering the missing entries in massive data tables." Watch him explain his work.
Candès is an AMS member and is on the Editorial Board for the Bulletin of the AMS. He received the 2015 AMS-SIAM George David Birkhoff Prize in Applied Mathematics and he gave the 2011 Erdős Memorial Lecture. Hear Candès talk about his research in a Mathematical Moment on compressed sensing. The purpose of the MacArthur Fellowship is to award unrestricted fellowships to talented individuals who have shown extraordinary originality and dedication in their creative pursuits and a marked capacity for self-direction. Each fellowship includes a $625,000, no-strings-attached award to extraordinarily talented and creative individuals as an investment in their potential.

Categories: Math and Stats

Bridges 2017: Highlights of the Conference on Mathematics Connections in Art, Music, and Science

AMS Feed - Tue, 2017-10-03 23:00

The 2017 Bridges Math and Arts conference drew mathematicians, scientists, artists, educators, musicians, poets, computer scientists, sculptors, dancers, weavers and model builders from around the world. Read the AMS report with photo slideshows.

Categories: Math and Stats

Alison Etheridge: President’s Message

IMS Bulletin - Mon, 2017-10-02 11:26

Jon Wellner (left) handed the President’s gavel to Alison Etheridge at JSM—a statistics meeting with a surprising amount for a probabilist like her to enjoy

Alison Etheridge is IMS President, 2017–18. She writes:

Well, that was it. Jon Wellner handed over the gavel (and I rapidly handed it back to Elyse for safe keeping). Tati took a picture and, before I had a chance to think about it, cheerily persuaded me to write something for the Bulletin. And now it seems that I am President of the IMS (and writing something for the Bulletin).

It was just the next daunting event in a slightly overwhelming week. The JSM is on a completely different scale to any meeting that I’ve ever before attended. Representing the IMS with Jon at the First-Time Attendee Orientation, I was too embarrassed to reveal the first timer’s ribbon lurking in my own pocket. But what a wonderful experience it was. I was surrounded by excited young people, all of whom were far more prepared than me. Yes, they had managed to download a version of the app that worked (not something that I had even tried with my aged not-very-smartphone); yes, they knew the opening hours of the Expo; and yes, they had planned their schedules. Jon reminded me that my own schedule required me to be elsewhere and we left a vast hall, buzzing with energy.

That was the first of many receptions at the JSM, where I met many more old friends than I expected and made plenty of new ones. Everybody was curious to know what I was setting out to achieve as President. I’m not sure that I found a very satisfactory answer. At one level, the answer is not much; the IMS is in great shape and my primary aim should be not to do too much damage. On the other hand, in an organisation that is so dependent upon electronic communication, it is especially important to make sure that initiatives don’t lose momentum, and I should certainly like to nudge along some of the activities that were kicked off by my predecessors.

Lacking a more imaginative agenda, when I set out to write this piece, I decided to turn to those predecessors for inspiration. What had they written? I think all of them agreed that a year is anyway too short a time in which to change very much. And there were other recurrent themes: IMS ‘groups’, data science, and a concern that probability and statistics — or at least probabilists and statisticians — might be moving apart. So where are we now?

Even having made my research base in a Department of Statistics for the last twenty years, I would still never describe myself as a statistician. And when I signed up for the JSM it was solely because of the IMS annual meeting; I didn’t even check out the scientific programme before registering. Indeed, I briefly wondered how I’d fill my days. In the event, I had quite the opposite problem. Here are a few of the keywords from the lectures that I attended: random forests, random networks, random matrices, Erdős-Rényi graphs, Wasserstein distances, differential geometry… Any probabilists out there feeling jealous? If there is any separation between probabilists and statisticians, it is certainly not built on a chasm between the underlying disciplines. On the other hand, statistics and probability are growing at an unprecedented rate, and it becomes ever more difficult to maintain any sense of cohesion across the piece. Here, the IMS has an important role to play in providing the necessary glue.

With governments and funding agencies in many parts of the world increasingly focused on “goal-oriented” research, with evident (and preferably imminent) financial or societal benefits, many theoretical researchers can feel sidelined. But experience shows that not only does this more applied research stimulate exciting theoretical research, but very often the theoreticians have spent decades developing just the tools that are required for the application. Think of financial mathematics. Bachelier’s famous thesis introduced the idea of using Brownian motion in option pricing in 1900, but it was essentially forgotten by economists for half a century. When it was rediscovered by economists in the 1950s, not only was the economics surprisingly relevant, but also stochastic analysis had matured to just the point where the tools were in place for the Nobel Prize-winning theory of option pricing to be developed. The pioneers of stochastic analysis were not motivated by the applications in finance, yet no goal-oriented research programme could have developed a better toolkit. And of course, stochastic analysis has much broader applications. In an era of big data, we are seeing the stochastic analysis story replicated many times over. It becomes ever clearer that we absolutely need a community to be developing the theoretical tools and structures, but that to exploit those increasingly sophisticated tools effectively requires communication and collaboration. By bringing so many people together under a single umbrella, the IMS offers outstanding opportunities for cross-fertilisation of ideas across our vast discipline.

IMS Groups

To be successful in this, in a rapidly changing scientific landscape, we must be effective and agile in engaging with our members, and that was one of the motivations for setting up the IMS groups. I should emphasize that groups are certainly not intended to be exclusive; it is perfectly reasonable to be associated with multiple groups, but they should facilitate more targeted engagement. The IMS groups have been around for seven or eight years, but, on the whole, I think that it is fair to say, they have not really taken off. The most obvious exception is the New Researchers Group, which has really flourished in the last couple of years: Richard Davis announced its existence in his President’s article two years ago, but the New Researchers Committee has existed for much longer, and I think that the committee will be the key to providing the continuity and “institutional memory” necessary for the group’s ongoing success. The New Researchers Committee focuses on ensuring the continuance of the New Researchers Conference (NRC), a robust web presence, and fostering new ways for young researchers to meet, collaborate, and share their experience. This last activity is greatly enhanced by the first: the scientific programme of the NRC is interspersed with discussion panels, as I experienced first-hand at NRC 2017. I have no idea how useful the comments of panel members were to the New Researchers (not least as, in large part due to the UK/US language barrier, this panel member only understood the topic of the panel she was on five minutes after it began), but the value of early career academics sharing their ideas and experience in a lively and positive environment is indisputable. I enjoyed myself enormously and I have already signed up for next year.

Of course there is still more for the New Researchers Group to do, such as increasing its global reach, but my hope is that lessons learned from their successes can be transferred to other areas. I suspect that each group will need some sort of steering committee to play a similar role to the New Researchers Committee and ensure that the leadership can be refreshed and passed on in a regular cycle. The first test case will be Data Science.

Data Science Group

The Data Science group was also announced by Richard in his piece two years ago. To my delight, Sofia Olhede and Patrick Wolfe have volunteered to take on its leadership at what I see as a crucial time. Whether one thinks that statisticians and probabilists should take ownership of data science, or instead clear up the mess left behind by the savvy, computationally adept, applications-driven researchers who increasingly dominate the area, it is clear that we should embrace data science. Who better than an IMS group to help define and emphasize the role of statistics and probability? I won’t steal their thunder (Sofia and Patrick have promised to write a piece for the Bulletin themselves), but I know that they’d welcome your input, so please do contact them at datascience@imstat.org if you would like to be involved (or even if you just want to make a suggestion).

Groups offer a clear conduit for the flow of ideas between IMS members and those they have entrusted with the leadership of the organisation. They provide a first port of call for the leadership when they are seeking expertise in particular areas and a natural mechanism for members to discuss concerns pertinent to their own interests and raise them with the leadership. We have discovered that most groups struggle to survive in the long term without some intervention, but I very much hope that we can renew our groups and with some minimal governance structures ensure that they adapt and grow with the scientific environment.

Like all our activities, groups rely on input from members. I am acutely aware of the pressures on everyone’s time and am deeply grateful for the time and energy that so many people put into ensuring that the scientific activities of the IMS are of the very highest quality. A concern that I have mirrors my experience in other parts of my academic life. In refreshing the committee membership for 2017–18, I had to guard against always calling on the “usual suspects” or people I’d previously worked with, not least as they are slightly more likely to respond to my emails than people I have never (e-)met. How can we involve more members?

The selfless contributions of members benefit the entire community, members and non-members alike. IMS membership has been falling—not rapidly, but enough to make me feel that we need to find a better way to articulate the benefits of membership. At first sight it seems that by joining, one just exposes oneself to the risk of being asked to take on yet more work. But turning this on its head, by joining, one is in a position to help shape the scientific programme of the IMS and make an important contribution to the profession. I did ask a few people at the JSM what they saw as the benefits of membership, or why they had allowed IMS membership to lapse. One person I spoke to, who had recently rejoined, expressed regret that he had ever allowed membership to lapse. Why? “I really like being part of the IMS community.” I myself joined because Ruth Williams asked me to serve on the Committee on Special Lectures, but added that she seemed to remember that I wasn’t actually a member of the IMS, and this was a requirement, so would I please join. I always do as Ruth asks, and so I joined. I have never looked back. From my perspective, perhaps the most rewarding aspect of being involved has been the opportunity to meet so many truly remarkable people.

I left the JSM with an immense sense of pride in the IMS. The IMS is a badge of academic quality; we publish outstanding journals and our Committee on Special Lectures had excelled in their contribution to the scientific programme. But most of all, the IMS is a community of scholars that supports and nourishes talent from right across the spectrum of our discipline. I think that I am looking forward to the next ten months. I have Jon, Xiao-Li, the Council and of course Elyse to bounce ideas off, to keep me on the straight and narrow, and to ensure that things actually happen. But I need your help too. Tell me what you want from your society. What do you like about the IMS? What do you dislike about the IMS? And, of course, if you would like to be more involved in any aspect of our activities, or if you have suggestions for improving the IMS or for new IMS initiatives, then please don’t hesitate to contact me at president@imstat.org.

Finally, here is your homework assignment:

1. What should the IMS do to remain relevant to our members?

2. How can we articulate the benefits of membership?

3. What more can we do to increase awareness of each other’s academic activities?

…and a bonus question:

4. Can anyone think of a single term that captures probability and statistics?

Send in solutions to the President’s mailbox as soon as you can—I don’t have long.

Categories: Math and Stats

Rao Prize Conference awards

IMS Bulletin - Mon, 2017-10-02 11:18

James L. Rosenberger (left) presenting the 2017 C.R. and Bhargavi Rao Prize to Donald B. Rubin

The Penn State Department of Statistics held the 2017 Rao Prize Conference on May 12, 2017, and honored three outstanding prize recipients (and IMS members). Donald B. Rubin, the John L. Loeb Professor of Statistics at Harvard University, received the 2017 C.R. and Bhargavi Rao Prize (pictured above). The 2017 C. G. Khatri Lecturer was Paul R. Rosenbaum, the Robert G. Putzel Professor of Statistics at the Wharton School; and the 2017 P. R. Krishnaiah Lecturer was Satish Iyengar, who is Professor of Statistics at the University of Pittsburgh.

IMS Fellow Donald B. Rubin is one of the most highly cited authors in the world in mathematics and economics. He is an elected Fellow/Member/Honorary Member of: the Woodrow Wilson Society, John Simon Guggenheim Memorial Foundation, Alexander von Humboldt Foundation, American Statistical Association, IMS, International Statistical Institute, American Association for the Advancement of Science, American Academy of Arts and Sciences, European Association of Methodology, British Academy, and the US National Academy of Sciences. Rubin received the Samuel S. Wilks Medal from the American Statistical Association, the Parzen Prize for Statistical Innovation, the Fisher Lectureship, and the COPSS George W. Snedecor Award.

A report of the conference is here.

Categories: Math and Stats

Karen Kafadar Elected ASA President

IMS Bulletin - Mon, 2017-10-02 11:16

Karen Kafadar, chair and commonwealth professor in the department of statistics at the University of Virginia, has been elected the American Statistical Association’s 114th President. She will serve a one-year term as president-elect beginning January 1, 2018; her term as president becomes effective January 1, 2019. Karen’s research interests focus on robust methods; exploratory data analysis; characterization of uncertainty in the physical, chemical, biological, and engineering sciences; and methodology for the analysis of screening trials.

Categories: Math and Stats

Walter A. Rosenkrantz passed away

IMS Bulletin - Mon, 2017-10-02 11:16

Walter A. Rosenkrantz, who passed away September 19, 2017, was Emeritus Professor at the Department of Mathematics and Statistics of the University of Massachusetts, Amherst. He had spent over 30 years at UMass. His Bachelor’s degree was from the University of Chicago in 1957, and his MS (1959) and PhD (1963) were from the University of Illinois. He wrote a textbook, Introduction to Probability and Statistics for Scientists and Engineers, in 1997 (McGraw-Hill). An obituary will follow.

Categories: Math and Stats

Nominate for Awards, Prizes

IMS Bulletin - Mon, 2017-10-02 11:15

IMS Awards

It’s time to think about nominating your outstanding colleagues and collaborators for these IMS awards: Tweedie award, Carver medal and IMS Fellowship.

The Tweedie New Researcher Award funds travel to present the Tweedie New Researcher Invited Lecture at the IMS New Researchers Conference. It was created in memory of Richard Tweedie, who mentored many young colleagues. New researchers (who received their PhD in 2012–2017), who are members of IMS, are eligible. The nomination deadline is December 1, 2017. See http://www.imstat.org/awards/tweedie.html.

Nominations are invited for the Carver Medal, created by the IMS in honor of Harry C. Carver, for exceptional service specifically to the IMS. All nominations must be received by February 1, 2018. Please visit http://www.imstat.org/awards/carver.html.

A candidate for the IMS Fellowship shall have demonstrated distinction in research in statistics or probability, by publication of independent work of merit. This qualification may be partly or wholly waived in the case of either a candidate of well-established leadership whose contributions to the field of statistics or probability other than original research shall be judged of equal value; or a candidate of well-established leadership in the application of statistics or probability, whose work has contributed greatly to the utility of and the appreciation of these areas. Candidates for fellowship should be members of IMS on December 1 of the year preceding their nomination, and should have been members of the IMS for at least two years (you can email Elyse Gustafson erg@imstat.org to check this before you start). The nomination deadline is January 31, 2018. For nomination requirements, see http://www.imstat.org/awards/fellows.htm.

Apply for Travel Awards

Applications are open for two types of travel awards. New this year is the IMS Hannan Graduate Student Travel Award, which funds travel and registration to attend (and possibly present a paper/poster at) an IMS sponsored or co-sponsored meeting. The travel awards are available to IMS members who are graduate students (seeking a Masters or PhD degree) studying some area of statistical science or probability. If you are a New Researcher (awarded your PhD in 2012–17) looking for travel funds, you should apply for the IMS New Researcher Travel Award to fund travel, and possibly other expenses, to present a paper or a poster at an IMS sponsored or co-sponsored meeting (apart from the IMS New Researchers Conference, which is funded separately). Applicants for both these travel awards must be members of IMS, though joining at the time of application is allowed (student membership is free, and new graduate membership is discounted!). The application deadline for both is February 1, 2018. See http://www.imstat.org/awards/hannan.html and http://www.imstat.org/awards/travel.html.

Nominations still open for COPSS, Doeblin

If you’re in a nominating mood, remember there’s still time to nominate for next year’s COPSS Awards: the R.A. Fisher Award and Lectureship (deadline December 15), and the Presidents’ Award and Elizabeth L. Scott Award (both January 15, 2018).

Nominations are also open (deadline November 15) for the Bernoulli Society’s Wolfgang Doeblin Prize, awarded to an early career researcher, for outstanding research in probability theory. See http://bulletin.imstat.org/2017/07/call-for-nominations-for-the-2018-doeblin-prize/

Categories: Math and Stats

2017 Presidential Address: Teaching Statistics in the Age of Data Science

IMS Bulletin - Mon, 2017-10-02 11:11

Jon A. Wellner gave his IMS Presidential Address at the Joint Statistical Meetings in Baltimore. In it, Jon reviewed the influence of data science and machine learning on the teaching of statistics at the graduate level in the USA, and drew comparisons with several articles by Harold Hotelling from the 1940s.

1. Statistics and Data Science: Introduction

What has happened, and is happening? Many departments of statistics in the U.S. and elsewhere have initiated new MS degree programs in Data Science. Several departments (including Yale and the University of Texas at Austin) have changed their names to “Statistics and Data Science.” Many departments of statistics have created new pathways in Data Science and Machine Learning at the PhD level: for example, the University of Washington, Carnegie Mellon University, and Stanford, among others.

These changes naturally lead to questions concerning curricula and teaching in departments of statistics (and elsewhere) at all levels: undergraduate and graduate, including MS and PhD levels. Before proceeding, I should openly disclose that I have two further reasons for trying to address the issues in teaching raised by the changes briefly outlined above. Firstly, my department chair has asked me to review theory course offerings in the PhD program in Statistics at the UW, and recommend changes in the curriculum, if needed. Secondly, I will be teaching Statistics 581 and 582, Advanced Statistical Theory, during Fall and Winter quarters 2017–2018. So, what should I be teaching?

2. Exciting Times for Statistics and Data Science

The current excitement and attractions of statistics and data science have been propelled and provoked by:

a] increased demand for our knowledge and expertise;

b] the challenges of “big data” in terms of both computation and theory;

c] changes needed in statistical education to meet these demands and challenges.

2.1. Increasing demand. This is a “golden age” for statistics! Statisticians (and Data Scientists) are in great demand across a wide range of endeavors ranging from science to medicine and business or commercial enterprises, and with employment opportunities in academia, government, and industry.

The following table from the US Bureau of Labor Statistics puts “statistician” right at the top of job categories for which there will be increased demand for the period 2014–2024:

Projections, Bureau of Labor Statistics, 2014–24

Job Description                              Increase
Statistician                                 34%
Mathematician                                21%
Software Developer                           17%
Computer & Information Research Scientist    11%
Biochemist and Biophysicist                  8%
Physicist and Astronomer                     7%
Chemist and Materials Scientist              3%
Computer Programmer                          −8%

Note that “Data scientist” has not yet entered the list of job descriptions here, but that there is certainly some overlap with the categories “Software Developer” and “Computer and Information Research Scientist.” The increasing demand for statisticians raises a number of important questions:

Q1 Can we meet the demand?

Q2 How should we be gearing up to meet the increased demand?

Q3 What changes should we be making in the teaching of statistics to attract the best and brightest students?

Q4 What should we be teaching?

2.2. Challenges of big data. “Big data” continues to present a variety of challenges for statistics and data science: for computation and analysis, as well as for theory. On the other hand, in a lucid discussion, “big data” is dismissed by Donoho [13] as a distinction between statistics and data science. I will return to Donoho’s thought-provoking article briefly below.

2.3. Changes needed in statistical education? Meeting the increased demand for statisticians and data scientists, and answering the challenges of big data, may well require further changes in degree structures and changes in curricula for existing and new degrees at all levels: high school, all college or university students, undergraduates, MS degree students, and PhD students. It may also require changes in the modes of teaching.

Johnstone’s short article [24] raises an intriguing question about student enrollment in undergraduate statistics majors, but that picture may well be changing with the increases in undergraduate study of statistics. Important developments and improvements are taking place at all these levels (see for example [16], [10], [6], [33], [5], [4]), but because of my own particular interests in graduate level teaching (and the current task from my chair), I will focus on curricula for MS and PhD programs in statistics during the remainder of this article.

There are clearly differing views within our community about the issues and challenges presented by data science for the statistics profession. Marie Davidian said in the 2013 Report of the London Workshop on the Future of Statistical Sciences [31]:

“I believe that the statistical sciences are at a crossroads, and that what we do currently … will have profound implications for the future state of our discipline. The advent of big data, data science, analytics, and the like requires that we as a discipline cannot sit idly by… but must be proactive in establishing both our role in and our response to the ‘data revolution’ and develop a unified set of principles that all academic units involved in research, training, and collaboration should be following. … At this point, these new concepts and names are here to stay, and it is counterproductive to spend precious energy on trying to change this. We should be expending our energy instead to promote statistics as a discipline and to clarify its critical role in any data-related activity.”

In an article [44] commenting on the London Workshop report, Terry Speed said:

“Are we doing such a bad job that we need to rename ourselves data scientists to capture the imagination of future students, collaborators, or clients? Are we so lacking in confidence … that we shiver in our shoes the moment a potential usurper appears on the scene? Or, has there really been a fundamental shift around us, so that our old clumsy ways of adapting and evolving are no longer adequate? … I think we have a great tradition and a great future, both far longer than the concentration span of funding agencies, university faculties, and foundations. … We might miss out on the millions being lavished on data science right now, but that’s no reason for us to stop trying to do the best we can at what we do best, something that is far wider and deeper than data science. As with mathematics more generally, we are in this business for the long term. Let’s not lose our nerve.”

On the other hand, Richard De Veaux, speaking at the London Workshop [31], said: “Statistics education remains mired in the twentieth (some would say the nineteenth) century.”

3. Back-tracking: History, part 1

At this point it might be helpful to review some of the history concerning the creation of departments of statistics in the US, and the organization of teaching in those departments. Harold Hotelling played a key role in this.

Harold Hotelling

Here is a brief recap of Hotelling’s career: He was born in 1895 in Minnesota. In 1904 he moved to Washington with his family. He studied journalism as an undergraduate at the University of Washington and earned a BA degree in journalism in 1919 after service in the Army during World War I interrupted his studies. Not finding journalism to his liking, Hotelling then earned an MS degree in Mathematics from the UW. During this period he was advised by Eric Temple Bell, who urged him to study economics. Hotelling gained entrance to the PhD program in Mathematics at Princeton, hoping to study mathematical economics and statistics. Finding no-one at Princeton engaged in these research directions, he turned to topology and differential geometry, earning a PhD degree under Oscar Veblen in 1924. Hotelling then joined the Food Research Institute at Stanford University and became associated with its Mathematics Department, where he served as an Assistant Professor during the period 1927–31. During this period he worked on both economics and statistics, and spent 6 months with R. A. Fisher at Rothamsted in 1929. In 1931 Hotelling moved to the Department of Economics at Columbia University, where he began attracting graduate students and other faculty interested in statistics, and he became involved in the early period of the IMS, serving as the sixth President of the IMS in 1941. During World War II Hotelling was deeply involved in the Statistical Research Group at Columbia, which played a key role in providing statistical advice to the US government and military. After failing to convince Columbia University to form a separate Department of Statistics, in 1946 Hotelling moved to the University of North Carolina at Chapel Hill, where he was able to create such a department. He remained at UNC Chapel Hill until his death in 1973. For further information concerning Hotelling and his work, see [41], [11], [2] and [47].

But the work of Hotelling which concerns us here is his 1940 paper [18] on the “Teaching of Statistics”. This was presented as an invited talk at a meeting of the IMS held in Hanover, New Hampshire. This could be viewed as a preliminary position paper of a committee formed by the IMS to examine the teaching of statistics. In his paper and talk Hotelling laid out the two difficulties involved in the teaching of statistics as of 1940, i.e., failure to recognize statistics as a science requiring specialists to teach it, and a shortage of qualified instructors.

Hotelling’s talk and paper strongly influenced Jerzy Neyman. Ingram Olkin [35] noted that Hotelling’s 1940 paper on the teaching of statistics “had a phenomenal impact. Jerzy Neyman stated that it was one of the most influential papers in statistics. Faculty attempting to convince university administrators to form a Department of Statistics often used this paper as an argument why the teaching of statistics should be done by statisticians and not by faculty in substantive fields that use statistics.”

On the other hand, the discussion of Hotelling’s paper by W.E. Deming raised issues relevant for applications: “Above all, a statistician must be a scientist. A scientist does not neglect any pertinent information.”

Hotelling authored at least two other works ([21], [19]) on the teaching of statistics. The 1948 Annals article was a report of the IMS Committee on the Teaching of Statistics with Hotelling as the chair and Walter Bartky, W. E. Deming, M. Friedman, and P. Hoel as further committee members. Part I of the article was presented as the consensus of the committee, was relatively brief and addressed the following questions:

(1) Who are the prospective students of statistics?

  (a) All college (university) students.

  (b) Future consumers of statistics.

  (c) Future users of statistical methods.

  (d) Future producers and teachers of statistical methods.

(2) What should they be taught?

(3) Who should teach statistics?

(4) How should the teaching of statistics be organized?

(5) What should be done about adult education?

The longer Part II of the article, The Place of Statistics in the University, was written by Hotelling and reflected his views. His major points are summarized:

A. Minor nuisances and inefficiencies in statistical teaching (Lack of coordination among departments; Lack of advanced courses and laboratory facilities; Inefficient decentralization of libraries)

B. The major evil: failure to recognize the statistical method as a science, requiring specialists to teach it (Too many teachers not specialists; Results: students ill equipped; Reasons why teachers of statistics are often not specialists: the rapid growth of the subject, confusion between the statistical method and applied statistics, failure to recognize the need for continuing research, and the system of making appointments to teach statistics within particular departments that are devoted primarily to other subjects; Appointments under the existing system are not all bad; Unsatisfactory texts; Omission of probability theory from texts and teaching)

C. Proper qualifications of teachers of statistics (Statistics compared with other subjects; Current research in the statistical method is essential for teachers; Minimum requirements in mathematics for the training of teachers and research men [sic] in statistical theory)

D. Need for relating theory with applied statistics (An example of the interaction between theory and practice; Supplying opportunities for application in graduate studies of statistics)

E. Recommendations on the organization of statistical teaching and research in institutions of higher learning (Research should be encouraged; teaching schedules should not be overloaded; Organization of statistical service in the university; Organization for teaching; The statistical curriculum; Statistical method as part of a liberal education).

The 1940 paper [18] and the 1948 committee report [21] were reprinted in Statistical Science in 1988 [20], followed by discussion pieces by D.S. Moore, J.V. Zidek, K.J. Arrow, H. Hotelling Jr., Ralph Bradley, W.E. Deming, S.S. Gupta, and I. Olkin. The discussion pieces reflected the long-standing (and creative) tensions between the influence of mathematics on statistical theory on the one side, and applications/data analysis on the other. Here, I will simply note Shanti S. Gupta’s view of Hotelling’s papers: “He rightly visualized the academic statistician as a tool-maker who ‘must not put all his time on using the tools he makes’, but must focus his/her attention on the tools themselves.”

Hotelling [18] had expressed the balancing act as follows: “Statistical theory is a big enough thing in itself to absorb the full-time attention of a specialist teaching it, without his going out into applications too freely. Some attention to applications is indeed valuable, and perhaps even indispensable as a stage in the training of a teacher of statistics and as a continuing interest. But particular applications should not dominate the teaching of the fundamental science, any more than particular diseases should dominate the teaching of anatomy and bacteriology to pre-medical students.”

In a review of the 1948 Hotelling Committee report and a similar report on the teaching of statistics by a committee of the Royal Statistical Society [25], Truman L. Kelley (Professor of Education at Harvard University) wrote: “It seems to the reviewer that there is implicit in the British recommendation an induction of the student into statistics via the subject matter of his field of specialization, and in the American an induction via logic, including principles of mathematics and probability. It is needless to say that these approaches are far asunder.”

These two quotes are a small sample of the long-running tensions within statistics and statistics education. In my view, these tensions are an inherent part of the process of creating new statistical methods and perspectives. Kelley [25] continued:

“The American committee, by omission and by inclusion, reveals what it considers to be preparatory background for students of statistics. It at no point cites knowledge of data in some scientific field as essential. … The American committee deplores the general lack of mathematical competence of most teachers of statistics in different subject matter fields. This is deplorable as is their lack of knowledge of the genius of data in their fields. However, the progress of recent decades should make one optimistic, and these two committee reports should encourage college presidents to strengthen and broaden the instruction in both mathematical and applied statistics.”

4. Back-tracking: History, part 2

A second set of important developments:

• In his 1962 paper, “The future of data analysis” [46], John Tukey called for a revamping of academic statistics, and pointed to a new science focused on data analysis.

• John Chambers (1993, [7]) and Bill Cleveland (2001, [9]) developed Tukey’s ideas further.

• Leo Breiman’s (2001) “Two cultures…” paper [3] clearly delineated the differing approaches to data analysis which developed in the years since Tukey (1962): predictive modeling (with the Common Task Framework) and generative modeling (inference).

• Donoho (2015), “50 years of Data Science” [13], gives a guide to this history, explains the key role of the Common Task Framework, and provides an updated road map to what he calls Greater Data Science.

Now we fast-forward to 2002–2004. By the beginning of the 21st century the era of data science, “big data”, and machine learning was well underway. Breiman’s (2001) paper [3] clearly delineated the differences in approaches to data analysis which had developed in the years since Tukey (1962) [46]. In May 2002, the NSF hosted a workshop on future challenges and opportunities for the statistics community. The resulting “Report on the Future of Statistics” by Bruce Lindsay, Jon Kettenring and David O. Siegmund (2004, [30]):

• addressed features of the statistical enterprise relevant to the NSF;

• did not include biostatistics;

• did not explicitly address teaching of statistics, but alluded to teaching indirectly through “manpower” problems;

• identified opportunities and needs for the “core of statistics”.

As noted in the report:

“If there is exponential growth in data collected and in the need for data analysis, why is “core research” relevant? … Because unifying ideas can tame this growth, and the core area of statistics is the one place where these ideas can happen and be communicated throughout science.”

Of course, there have been big changes both in statistics and in the world of science in general since Hotelling’s time, and even since the Lindsay-Kettenring-Siegmund report of 2004. Here is an oversimplified summary, making comparisons between 1940 and now (or 2015):

| | “Then” (1940) | “Now” (2015) |
|---|---|---|
| # of statistics departments | 5–10 | ≈60 |
| # of biostatistics departments | ≈1 | ≈43 |
| # of graduate students, statistics | < 50? | 4597 (24%) |
| # of graduate students, biostatistics | < 10? | 1960 (14%) |
| IMS membership | < 100? | ≈3500 |
| Computer clock speed | 5–10Hz Zuse Z3 (1941) | >2.7GHz Mac Powerbook |
| Terminology/ Department names | Mathematical Statistics, Applied Statistics | Statistics, Data Analysis, Data Science |

This table shows that one clear outcome of Hotelling’s papers [18], [21] and [19] has been the establishment of separate departments of statistics in the US. Now we are in the midst of considerable (remarkable? large? exponential?) increases in enrollment in the courses offered by these departments. The following two graphs (by Steve Pierson [38]) show the growth in Bachelor’s, Master’s, and PhDs in Statistics and Biostatistics combined and in Biostatistics (separately) over the period 1987–2015 [in the US]:

Note the much slower growth in PhD degrees versus degrees at the Master’s and Bachelor’s levels. Comparing these curves gives some pause for reflection!

5. MS curricula in statistics

But what about the curricula of the new Data Science and Machine Learning programs? For example, what is the curriculum of one of the typical new Data Science (DS) MS degree programs?

Donoho (2015) section 7 reviews a typical such curriculum (at UC Berkeley). There the core of the MS Data Science curriculum includes: Research Design and Application for Data and Analysis; Exploring and Analyzing Data; Storing and Retrieving Data; Applied Machine Learning; and Data Visualization and Communication. The advanced courses include: Experiments and Causal Inference; Applied Regression and Time Series Analysis; Legal, Policy, and Ethical Considerations for Data Scientists; Machine Learning at Scale; Scaling up! Really big data; and a Capstone course (with data analysis project). The program at Berkeley is run by the Information School.

At my home, the University of Washington, the DS MS program is run by the E-Science Institute (with co-operation from Statistics, CS, and Biostatistics). The curriculum includes: Introduction to Statistics and Probability; Data Visualization & Exploratory Analytics; Applied Statistics and Experimental Design; Data Management for Data Science; Statistical Machine Learning for Data Scientists; Software Design for Data Science; Scalable Data Systems and Algorithms; Human-Centered Data Science; and Data Science Capstone Project.

There is clear overlap in both lists with courses offered in a traditional statistics MS program, but with a number of substitutions from a Computer Science MS program. Ten American MS Programs in Data Science and Analytics were surveyed in Amstat News articles in April and June 2017. Each of these surveys included the following query to one of the principal organizers or instructors for the program: “Do you have any advice for institutions considering the establishment of such a degree?”

One of the responses, which seemed very thoughtful and relevant, was from Mark Craven, Univ of Wisconsin–Madison: “I would advise any institution considering this area to build on existing partnerships between statistics, biostatistics, computer sciences, and biomedical informatics. No one unit can or should ‘own’ this area, so proceeding in a broad and inclusive way makes the most sense.”

Donoho [13] gives an analysis of the Berkeley Data Science curriculum in the context of Tukey’s critiques and writings. He writes: “Although my heroes, Tukey, Chambers, Cleveland and Breiman, would recognize positive features in these programs, it’s difficult to say whether they would approve of their long-term direction —or if there is even a long-term direction to comment about: … Data Science Masters curricula are compromises: taking some material out of a Statistics masters program to make room for large database training; or, equally, taking some material out of a database masters in CS and inserting some statistics and machine learning. Such a compromise helps administrators to quickly get a degree program going, without providing any guidance about the long-term direction of the program and about the research which its faculty will pursue. What long-term guidance could my heroes have provided?”

6. PhD curricula in statistics

At the University of Washington, the PhD program has four possible tracks: Normal/ Basic track; Statistical genetics; Statistics for the Social Sciences; and Machine Learning (ML) and Big Data (BD). The following table shows PhD student numbers in each of these tracks at the University of Washington over the period 2001-2016:

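| Track | Graduated | Current | Total |
|---|---|---|---|
| Normal, Stat | 83 | 37 | 120 |
| Normal, Biost | 103 | 49 | 152 |
| StatGen, Stat | 13 | 3 | 16 |
| StatGen, Biost | 5 | 3 | 8 |
| Stat in Soc Sci | 5 | 1 | 6 |
| ML-BD | 1 | 13 | 14 |
| total, Stat | 102 | 54 | 156 |
| total, Biostat | 108 | 52 | 160 |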
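As a quick consistency check on the table above, the track counts can be added up and compared with the reported totals. The sketch below is illustrative only: the dictionary and function names are made up here, and it assumes (as the totals suggest) that the “Stat in Soc Sci” and “ML-BD” tracks are counted under Statistics.

```python
# (graduated, current) counts copied from the table above.
# Assumption: "Stat in Soc Sci" and "ML-BD" belong to the Statistics totals.
stat_tracks = {
    "Normal": (83, 37),
    "StatGen": (13, 3),
    "Stat in Soc Sci": (5, 1),
    "ML-BD": (1, 13),
}
biostat_tracks = {
    "Normal": (103, 49),
    "StatGen": (5, 3),
}


def totals(tracks):
    """Return (graduated, current, graduated + current) summed over tracks."""
    graduated = sum(g for g, _ in tracks.values())
    current = sum(c for _, c in tracks.values())
    return graduated, current, graduated + current


def current_share(tracks, name):
    """Fraction of currently enrolled students who are in the named track."""
    _, current_total, _ = totals(tracks)
    return tracks[name][1] / current_total


stat_totals = totals(stat_tracks)
biostat_totals = totals(biostat_tracks)
ml_bd_share = current_share(stat_tracks, "ML-BD")

print("Statistics totals:", stat_totals)
print("Biostatistics totals:", biostat_totals)
print("ML-BD share of current Statistics PhD students:", round(ml_bd_share, 3))
```

Under that assumption the sums reproduce the “total” rows, and the 13 ML-BD students are 13 of the 54 current Statistics PhD students, roughly a quarter.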

From this table (and especially the 13 PhD students currently enrolled in the ML-BD track) it is clear that the ML-BD track is attracting a substantial number of our current PhD students. This makes re-consideration of the curriculum in statistical theory (and methodology) increasingly important. What should I be teaching in Statistics 581-582 during the coming two quarters? What would my heroes recommend if they were here to offer their wise advice?

My heroes are different than David Donoho’s; they include: H. Chernoff, J.L. Doob, R.A. Fisher, Jaroslav Hajek, Wassily Hoeffding, Harold Hotelling, Jack Kiefer, Lucien Le Cam, Charles Stein and Abraham Wald. My sense is that future research directions, including manifold learning, topological data analysis, and statistical methods to deal with nonstandard data types (functions, trees, images, etc.) will require more mathematics and more probability rather than less.

7. Back to my problem: what to teach in the theory course sequence?

Here is a brief outline of UW’s current Statistics 581, 582 and 583 courses.

Outline for Stat 581:

Inequalities; basic asymptotic theory in statistics. Examples: robustness (or lack of robustness) of normal theory tests; chi-square statistic and power of chi-square tests under fixed and local alternatives; limit theory for fixed dimension linear regression; limit theory for correlation coefficients; limit theory for empirical distributions and sample quantiles; examples from survival analysis/censored data.

Lower bounds for estimation. Multi-parameter Cramér–Rao lower bounds; super-efficiency & introduction to Hajek–Le Cam convolution theorem and local asymptotic minimax theorems; simple lower bound lemma via two point inequalities.

Classical (and nonparametric) maximum likelihood: Existence; empirical d.f. & empirical measure as MLEs; algorithms, one step approximations, and EM; LR, Wald, and Rao tests: fixed and local alternatives; Brief introduction to agnostic viewpoint: what if the model fails?

Outline for Stat 582:

Elementary Decision Theory: Bayes rules, minimax rules, and connections.

Bayes theory, inadmissibility, and empirical Bayes.

Optimal tests and tests optimal in subclasses: eliminating nuisance parameters by conditioning and invariance.

Outline for Stat 583:

Parameters as functionals and the Delta Method. Continuity: probability metrics and properties of functionals; differentiability of functionals: Fréchet, Hadamard, and Gateaux; examples and applications.

Resampling methods. General approach to bootstrap and resampling methods; Jack-knife; Bootstrap methods; examples and applications; Bootstrap and the delta method.

New in Statistics 581 – 582 during this coming year?

• Large scale hypothesis testing and FDRs?

• More on empirical Bayes?

• More on convexity?

• More on empirical process theory and use of inequalities?

• …??

What should be reduced or deleted? I don’t know exactly yet, but I’m working on it… and on the report to my chair.

Let me close with a couple of excerpts from Efron and Hastie [14]. From the Preface, page xvii: “Useful disciplines that serve a wide variety of demanding clients run the risk of losing their center. Statistics has managed, for the most part, to maintain its philosophical cohesion despite a rising curve of outside demand. The center of the field has … moved in the past sixty years, from its traditional home in mathematics and logic toward a more computational focus.” And from the Epilogue, page 447: “It is the job of statistical inference (theory) to connect ‘dangling algorithms’ to the central core of well-understood methodology. The connection process is already underway.”

Here, then, is a brief summary of my views:

• Embrace and encourage data science!

• Continue evolving the curriculum to teach the unifying themes of statistical research.

• Keep doing what statisticians do best: question, question, question… and then provide the best answers possible, based on the available data.

• Attract the best and brightest students to research work in statistics.

• Teach what we know!

I’d like to close with a personal aside: H. Hotelling played a significant role in my own involvement in statistics as a career. As is clear from the Statistical Science “Conversation with Z.W. Birnbaum” [32], Hotelling pointed (Bill) Birnbaum toward a position at the University of Washington in 1939. Bill created a strong environment for statistics within the UW Math Department during the period 1940–1970, and his students Ronald Pyke, Albert Marshall, and the people they attracted to the UW became my teachers and mentors when I arrived at the UW in 1971.

References

[1] A. Agresti and X.-L. Meng, editors. Strength in Numbers: the Rising of Academic Statistics Departments in the US. Springer, New York, 2013.

[2] K.J. Arrow and E.L. Lehmann. Harold Hotelling, September 29, 1895–December 26, 1973. Biographical Memoirs, National Academy of Sciences, 87:220–233, 2006. http://www.nap.edu/catalog11522.html.

[3] L. Breiman. Statistical modeling: the two cultures. Statist. Sci., 16(3):199–231, 2001. With comments and a rejoinder by the author.

[4] E.N. Brown and R.E. Kass. Rejoinder [mr2750071; mr2759682; mr2759683; mr2759684; mr2759685; mr2759686]. Amer. Statist., 63(2):122–123, 2009.

[5] E.N. Brown and R.E. Kass. What is statistics? Amer. Statist., 63(2):105–111, 2009.

[6] N. Chamandy, O. Muralidharan, and S. Wager. Teaching statistics at Google-scale. Amer. Statist., 69(4):283–291, 2015.

[7] J.M. Chambers. Greater or lesser statistics: a choice for future research. Statistics and Computing, 3:182–184, 1993.

[8] R. Chellappa. Mathematical statistics and computer vision. Image and Vision Computing, 30:467–468, 2012.

[9] W.S. Cleveland. Data science: an action plan for expanding the technical areas of the field of statistics. International Statistical Review, 69:21–26, 2001.

[10] G.W. Cobb. Teaching statistics: some important tensions. Chil. J. Stat., 2(1):31–62, 2011.

[11] A. C. Darnell. Harold Hotelling 1895–1973. Statistical Science, 3:57–62, 1988.

[12] R. A. Davis. IMS Presidential Address: Are we meeting the challenge? IMS Bulletin, 45(7), Oct/Nov 2016.

[13] D. Donoho. 50 years of data science. Technical report, Dept of Statistics, Stanford University, 2015. presented at the Tukey Centennial Workshop, Princeton U.; http://courses.csail.mit.edu/18.337/2015/docs/50YearsDataScience.pdf.

[14] B. Efron and T. Hastie. Computer Age Statistical Inference, volume 5 of IMS Monographs. Cambridge University Press, New York, 2016.

[15] P. Hall. We live in exciting times. In Past, Present, and Future of Statistical Science, pages 157–169. Chapman and Hall, Boca Raton, FL, 2015.

[16] N. J. Horton and J. S. Hardin. Teaching the next generation of statistics students to “think with data”: special issue on statistics and the undergraduate curriculum [Guest editorial]. Amer. Statist., 69(4):259–265, 2015.

[17] H. Hotelling. The selection of variates for use in prediction with some comments on the general problem of nuisance parameters. Ann. Math. Statist., 11:271–283, 1940.

[18] H. Hotelling. The teaching of statistics. Ann. Math. Statist., 11:457–470, 1940.

[19] H. Hotelling. The place of statistics in the university. In Proc. First Berkeley Sympos. Math. Statist. & Probability (Berkeley, Calif., 1949), Vol. I: Statistics, pages 21–40. Univ. of California Press, Berkeley, Calif., 1949.

[20] H. Hotelling. Golden oldies: Classic articles from the world of statistics and probability: The teaching of statistics. Statist. Sci., 3:63–71, 1988.

[21] H. Hotelling, W. Bartky, W. Deming, M. Friedman, and P. Hoel. The teaching of statistics. Ann. Math. Statist., 19:95–97, 1948.

[22] P. J. Huber. The behavior of maximum likelihood estimates under nonstandard conditions. In Proc. Fifth Berkeley Sympos. Math. Statist. and Probability (Berkeley, Calif., 1965/66), Vol. I: Statistics, pp 221–233. Univ. California Press, Berkeley, Calif., 1967.

[23] P. J. Huber. Data Analysis. John Wiley & Sons, Inc., Hoboken, NJ, 2011.

[24] I. Johnstone. Where are the majors? In Past, Present, and Future of Statistical Science, pp 153–156. Chapman and Hall, Boca Raton, FL, 2015.

[25] T. L. Kelley. Review of: “The Teaching of Statistics: A Report of the Institute of Mathematical Statistics” and “The Teaching of Statistics in Universities and Colleges”. J. Amer. Statist. Assoc., 43:493–496, 1948.

[26] M. R. Kosorok. Rejoinder to discussion of “What’s so special about semiparametric methods?” [mr2639296; mr2639297; mr2639298]. Sankhyā, 71(2, Ser. A):369–371, 2009.

[27] M. R. Kosorok. What’s so special about semiparametric methods? Sankhyā, 71(2, Ser. A):331–353, 2009.

[28] M. Liberman. Reproducible research and the common task method. Technical report, Simons Foundation Frontiers of Data Science Lecture, 2015. video available.

[29] X. Lin, C. Genest, D. L. Banks, G. Molenberghs, D. W. Scott, and J.-L. Wang, editors. Past, Present, and Future of Statistical Science. Chapman and Hall, Boca Raton, FL, 2015.

[30] B. G. Lindsay, J. Kettenring, and D. O. Siegmund. A report on the future of statistics. Statist. Sci., 19(3):387–413, 2004. With comments.

[31] D. Mackenzie, D. Madigan, and R. Wasserstein. Statistics and science; A Report of the London Workshop on the Future of the Statistical Sciences. 2013.

[32] A. W. Marshall. A conversation with Z. William Birnbaum. Statist. Sci., 5(2):227–241, 1990.

[33] X.-L. Meng. Desired & feared: what do we do now and over the next 50 years? Amer. Statist., 63(3):202–210, 2009.

[34] J. Neyman. First Course in Probability and Statistics. Henry Holt and Co., New York, N. Y., 1950.

[35] I. Olkin. Reminiscences of the Columbia University Department of Mathematical Statistics in the late 1940s. In Past, Present, and Future of Statistical Science, pp 23–28. Chapman and Hall, Boca Raton, FL, 2015.

[36] I. Olkin and A. R. Sampson. Harold Hotelling. In C. C. Heyde, E. Seneta, P. Crepel, S. E. Fienberg, and J. Gani, editors, Statisticians of the Centuries, pp 454–458. Springer New York, New York, NY, 2001.

[37] I. Olkin and A. R. Sampson. Hotelling, Harold (1895–1973). In International Encyclopedia of the Social and Behavioral Sciences, pages 6921–6925. 2001.

[38] S. Pierson. Science policy: Statistics, biostatistics degree growth sustained through 2015. Amstat News, 472:16–19, 2016.

[39] S. Pierson. Master’s programs in data science and analytics. Amstat News, 478:22–27, 2017.

[40] S. Pierson. Master’s programs in data science and analytics (continued … ). Amstat News, 480:15–24, 2017.

[41] W. L. Smith. Harold Hotelling 1895 – 1973. Ann. Statist., 6(6):1172–1183 (1 plate), 1978.

[42] W. L. Smith. Harold Hotelling, 1895–1973. Ann. Statist., 6:1172–1183, 1978.

[43] T. Speed. Hotelling, and the teaching of statistics, Hotelling lecture # 1. University of California, Berkeley, 2007.

[44] T. Speed. Trilobites and us. Amstat News, January 2014, 439(1):9–10, 2014.

[45] S. M. Stigler. The Seven Pillars of Statistical Wisdom. Harvard University Press, Cambridge, MA, 2016.

[46] J. W. Tukey. The future of data analysis. Ann. Math. Statist., 33:1–67, 1962.

[47] Wikipedia. Harold Hotelling. 2017. https://en.wikipedia.org/wiki/Harold_Hotelling.

[48] B. Yu. IMS Presidential address: Let us own data science. IMS Bulletin, 43(7), Oct/Nov 2014.

Categories: Math and Stats

Obituary: Ken-ichi Yoshihara, 1932–2016

IMS Bulletin - Mon, 2017-10-02 10:27

Ken-ichi Yoshihara and his wife Yasuko, at Texas A&M University in Kingsville, 2012

Professor Ken-ichi Yoshihara died in Yokohama, Japan, on October 29, 2016. He was born on September 20, 1932 in Zushi, a small town near Yokohama. He graduated with a BA from Yokohama National University in 1954; and received a master’s degree (1956) and PhD (1965) from the Tokyo University of Education (later Tsukuba University). He held faculty appointments at Yokohama National University in Yokohama (1963–97) and at Soka University in Hachioji (1997–2007). He was awarded the Japanese government’s Medal of Honor with Purple Ribbon in 2011, for his outstanding contributions in mathematics and in education.

Professor Yoshihara was a pioneer of probability theory and statistics in the analysis of weakly dependent random variables. He established a breakthrough method for approximating a sequence of dependent random variables satisfying some mixing conditions by a sequence of independent random variables constructed carefully according to the joint distributions of the original sequence. He estimated the error terms very accurately and obtained nearly the best possible error bounds for the approximation (see [1]). He especially studied dependent random variables satisfying the absolutely regular mixing condition. The ϕ-mixing condition implies the absolutely regular mixing condition, and the absolutely regular mixing condition implies the strong mixing condition (see [1], [2], [3] and [5]).

In time series analysis, the properties of a series are investigated through the equation that defines its model. For example, in the autoregressive (AR) model, the random variable at time t is defined as a weighted sum of the values of the series before time t plus a noise term. Since each random variable at time t can then be written as a sum of infinitely many random noise terms, such research requires very complicated calculations. On the other hand, from the viewpoint of mixing properties, a large family of linear time series models, such as AR models, satisfies some mixing conditions. Therefore Yoshihara’s approximation method for random variables with mixing conditions is very useful in time series analysis. (See [5].)
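To make the "infinitely many noise terms" concrete, here is the standard textbook case of a first-order autoregressive model. The symbols a (the coefficient) and ε_t (the noise) are notation chosen here for illustration, not taken from Yoshihara's papers, and the limit assumes a stationary solution with |a| < 1 and noise of finite variance. Iterating the defining recursion k times gives:

```latex
X_t = a X_{t-1} + \varepsilon_t
    = a^{k} X_{t-k} + \sum_{j=0}^{k-1} a^{j}\, \varepsilon_{t-j}
    \;\longrightarrow\; \sum_{j=0}^{\infty} a^{j}\, \varepsilon_{t-j}
    \quad (k \to \infty), \qquad |a| < 1 .
```

Every X_t therefore depends on the entire past of the noise sequence, with geometrically decaying weights. That decay is what makes observations far apart in time nearly independent, which is the kind of behavior mixing conditions are designed to capture.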

Using his approximation method, he extended some limit theorems such as the central limit theorem and law of large numbers for independent random variables to weakly dependent random variables. In particular, he paid attention to symmetric statistics like U-statistics and V-statistics, and showed the asymptotic normality of such statistics for dependent random variables satisfying some mixing condition. (See [1], [4] and [5].)
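For readers unfamiliar with U-statistics, the following minimal sketch may help. A U-statistic of degree two averages a symmetric kernel over all unordered pairs of observations; with the kernel (x − y)²/2 it coincides with the usual unbiased sample variance. The function names and the data are made up for illustration, and the sketch shows only the definition, not the limit theory for dependent data discussed above.

```python
from itertools import combinations
from statistics import variance


def u_statistic(sample, kernel):
    """Average of a symmetric two-argument kernel over all unordered pairs."""
    pairs = list(combinations(sample, 2))
    return sum(kernel(x, y) for x, y in pairs) / len(pairs)


def half_squared_difference(x, y):
    """Kernel whose U-statistic is the unbiased sample variance."""
    return (x - y) ** 2 / 2


# Hypothetical data, used only to illustrate the definition.
data = [2.0, 4.0, 4.0, 5.0, 7.0, 9.0]

u_value = u_statistic(data, half_squared_difference)
sample_var = variance(data)  # unbiased sample variance (divides by n - 1)

print("U-statistic:", u_value)
print("Sample variance:", sample_var)
```

The two printed values agree up to floating-point rounding, reflecting the identity that the sum of squared pairwise differences equals n times the sum of squared deviations from the mean.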

He also developed the theory of extreme value statistics for weakly dependent random variables. Recently, the rise in the risk of natural disasters due to climate change has been causing concern. Since extreme value statistics is deeply involved in such risk analysis, it is increasingly important. Originally, extreme value statistics was investigated for independent random variables. As mentioned previously, many time series described by linear models satisfy mixing conditions. Therefore Yoshihara’s approximation method allows extreme value statistics to be applied to time series, which has broadened its applicability. (See [5].)

In [5], Yoshihara collected recent developments in the analysis of stochastic sequences of weakly dependent random variables in probability theory and statistics into a substantial 15-volume work.

Finally, I mention Professor Yoshihara’s interest in education, not only for university students but also for high school students. He wrote several mathematics textbooks for high school students, which were approved by Japan’s Ministry of Education.

Shuya Kanagawa, Tokyo City University

References:

[1] Yoshihara, K. (1976) Limiting behavior of U-statistics for stationary absolutely regular processes. Z. Wahrsch. Verw. Gebiete 35: 237–252.

[2] Yoshihara, K. (1978) Limiting behavior of one-sample rank order statistics for absolutely regular processes. Z. Wahrsch. Verw. Gebiete 43: 101–127.

[3] Yoshihara, K. (1978) Probability inequalities for sums of stationary absolutely regular processes and their applications. Z. Wahrsch. Verw. Gebiete 43: 319–329.

[4] Kanagawa, S. and Yoshihara, K. (1994) The almost sure invariance principles of degenerate U-statistics of degree two for stationary random variables. Stoch. Proc. Appl. 43: 347–356.

[5] Yoshihara, K., Weakly Dependent Stochastic Sequences and their Applications. Vols. 1–15 (1992–2005) Sanseido, Tokyo.

Categories: Math and Stats

Pro Bono Statistics: Learning as the replication of knowledge

IMS Bulletin - Mon, 2017-10-02 10:23
Yoram Gat writes in his second column:

I remember a few scattered comments by professors, which I heard or overheard as a graduate student and which gave a glimpse into the professors’ insights about learning. Maybe those comments stuck because they addressed a topic which I have so rarely heard discussed.

A major declared goal of the educational system, and a non-negligible part of its function in practice, is to have people learn. It is therefore quite surprising that consideration of the process of learning itself is largely absent from the curriculum. Throughout the years I spent at school and in higher education institutions I cannot remember a single lesson devoted to how learning occurs and what determines its success, let alone a deeper, more systematic treatment. It may very well be that analysis of learning is part of the higher education curriculum in certain departments, but with learning being a central occupation of schools in general, its consideration would be expected not to be limited to specialized fields but rather given a prominent place at school, from a very young age.

It seems to me that the strange silence about learning is a reflection of, as well as a reason for, the perpetuation of a certain model of learning that is implicitly and conventionally assumed. According to this model, learning is a process of reproducing and assembling standard units of knowledge. Working from an existing blueprint and using a standardized process, the units of knowledge can be produced and then assembled to create a knowledge structure, much like interlocking machine parts are produced and assembled into a machine. Teachers have some of those knowledge pieces in their minds. A competent teacher can describe those pieces and their proper relative positions to the students. Any competent and attentive student can use the description to create a copy of the pieces in their mind and assemble them together with the pieces of knowledge already present there. Once this has been done the student has learned (although some homework may be useful for oiling the gears of the newly created mental machinery, particularly for less-than-brilliant students). If learning is such a straightforward process of replication, there may be nothing to discuss. Each student is characterized by an individual academic ability which may be conceived of as a one-dimensional parameter. This parameter determines the ease and speed at which the student can replicate and assemble new units of knowledge which are presented to them. This ability—“intelligence”—is essentially innate and opaque, so rather than wondering about how learning works, the model leads towards attempting to operationalize the model’s parameter and to measure its magnitude for each student.

This schematic description of a learning model is necessarily a strawman and any experienced student or teacher would likely have various reservations about accepting it. Yet it seems to me that it is essentially this model that dominates the way learning is perceived and handled in society. If it is hard to find explicit endorsements of the learning-as-replication model, this is not because other models are used or even entertained, but, on the contrary, because it is taken to be too obvious to admit any alternatives. School is obviously about students acquiring the knowledge their teachers have. Could teaching be anything other than a process of piecemeal replication of the knowledge machinery?

In various ways, society embodies the replication model as well as encourages its internalization by both students and teachers. Learning via a standardized process is put center-stage by an emphasis on class attendance and by minimal provision of interactive individual student-teacher sessions. The material taught is presented as objective and authoritative and the teachers are assumed to know all there is to know about it, so scholastic achievement is operationalized as the ability to imitate established patterns (often essentially verbatim), with little expectation for creative, let alone diverging, expression. The future of education is discussed in terms of larger scale, more efficient knowledge replication via mass media channels such as MOOCs (“the democratization of knowledge”). The educational system is busy measuring and classifying students along an axis of scholastic achievement which is taken to be a reflection of an objective ability. The system then reports the achievement measurements and classifications to interested parties. Students are driven to fulfill their inherent potential by a system of punishments and rewards which are meted out to students according to their quantile on the achievement distribution. Works of fiction and nonfiction endlessly celebrate the genius, that legendary outlier who is able to absorb knowledge quickly and effortlessly.

Thus the replication model of learning is so deeply embedded in society it seems inevitable. But the test of a learning model, like that of any model, is in how well it serves its users. Are students, and society in general, well served by the replication model? Does this model capture the important features of learning? If not, what parts of it need to be reconsidered, and what are the practical implications for society—in particular for students, teachers and managers of the educational system—of changing the model?

Yoram would be happy to have a critical and skeptical conversation about the topics he discusses in this column. He invites readers to comment below, or you can email us at bulletin@imstat.org.

 

Categories: Math and Stats

Learning Sessions: more work, less shop?

IMS Bulletin - Mon, 2017-10-02 10:21
Jan Swart is a research fellow at the Institute of Information Theory and Automation, in the Academy of Sciences of the Czech Republic, Prague. He writes to share his experiences of co-organizing Learning Sessions, trialling a new format for sharing knowledge:

 

In four columns published in 2014 and 2015 in the IMS Bulletin [here, here, here and here], Vlada Limic made a case for organizing events different from the familiar mathematical workshops (which are, in effect, small conferences). She proposed something closer to workshops in certain other disciplines (like classical guitar playing): events where the focus is more on work—meaning getting your hands dirty, becoming seriously involved with new material, trying to learn something new—and less on the “shop” part of workshop, i.e., the familiar show-off of whom you are working with, what you are working on, what your results are, with a hint of the techniques involved.

When Rongfeng Sun, Matthias Birkner and I got the chance to organize a month of activities at the Institute for Mathematical Sciences in Singapore, which was held in July–August this year, we took Vlada’s blog as an inspiration to split the central two weeks into two halves, with a usual workshop in the second week (August 7–11), and something we called Learning Sessions in the first week (July 31–August 4). This was not quite the format suggested in Vlada’s blog from Nov 17, 2014, although it was loosely inspired by it. Since it may be of wider interest to see how this worked out, let me describe our experiences here.

Before I embark on this, it is probably fair to say that although, in view of various constraints, workshop talks often manage to convey only a vague idea of the mathematics involved, this is not true in general. As a whole, I have learned a lot from listening to talks. There have even been memorable talks that I walked away from with the feeling that I could immediately start working on a topic that just an hour earlier had been completely unknown to me. In some cases, I did. Such talks are rare, but since we have only so much energy and time to write a few papers per year, at most, we don’t need many such talks.

Also, there already exist events with formats other than workshops that are aimed more at teaching and learning: summer schools, mini-courses, and the traditional Arbeitsgemeinschaft that is held twice each year in Oberwolfach, to name a few. Partly inspired by these, and partly by Vlada’s ideas, we came up with the following format for our Learning Sessions. We envisaged nine sessions, each lasting half a day. Topics could be both classic material and new, cutting-edge results. Half a year before the sessions, we drew up a list of possible topics based on suggestions from the participants and from ourselves. We then asked participants which sessions they would like to present or participate in, and organized sessions on the nine most frequently named themes.

To each session, based on the preferences of the participants, we assigned two or three “moderators,” whose task it was to prepare the material to be studied and then, during the session, to give an introduction to the topic and bring up questions to be discussed. As a rule, we did not allow moderators to be authors of the papers to be discussed. Other than that, the level of expertise varied: some moderators had been familiar with their topic for many years, while others had to learn something completely new.

In certain respects, the Learning Sessions turned out different from how we expected. We had suggested that participants should not attend all sessions, but focus on up to four sessions of their choice. In the end, though, most participants went to most sessions. This may have been partially due to the fact that, although our format did not guarantee that this would happen, we were lucky that the chosen topics formed a coherent whole, with many cross connections between different topics.

In the months before the Learning Sessions, there was some discussion about the amount of preparation that could (or should) be expected from the participants. In the end, we decided not to put pressure on them, except for recommending some preparation, and also did not require participants to register for sessions of their interest. A quick, informal survey afterwards suggests that most participants did not invest much time preparing for sessions they intended to attend. This was probably also due to the fact that for most sessions, there was little material available that could be studied beforehand, except for a list of articles. In the end, two sessions created lecture notes, but these were available only a week or so before the start of the sessions. In addition, for one session, two volunteer participants were appointed in advance who prepared short presentations on chosen topics.

Our original idea was that each session would consist of an introduction by the two or three moderators, followed by a structured discussion led, as the name suggests, by those same moderators. In the end, there was very little discussion, probably due to a combination of factors:

1. It is hard to think of good themes to discuss in a group, in the limited time span of, say, an hour.

2. Since in the end most participants went to most sessions, the number of attendees at each session was quite large.

3. Many moderators found out that in order to present the material in a way that went a bit deeper, they needed (almost) all their allocated time, which was 160 minutes.

There was some good news: in spite of turning out somewhat differently than expected, it seems fair to say that most participants agreed the Learning Sessions were, yes, quite a success. So what worked well, and what did they achieve?

First of all, it seems people really learned, and learned a lot. The moderators, who all really put a lot of effort into their presentations, in fact themselves learned a lot from this; especially those who presented a topic that was new to them. The presentations, approximately two and a half hours long, really managed to delve deeper into the material than an ordinary workshop talk. In addition, they were not hindered by the need to quickly go over a lot of “well-known” results, that may not be so well-known to the audience, in order to come to the (often rather specialized) new parts the speaker has added.

Also, the Learning Sessions gave plenty of opportunities for interaction:

1. There was interaction between the moderators, who sometimes had never worked together before.

2. Interaction between different sessions, revealing new connections, which in at least one case (mine) led to a new project and a new collaboration.

3. As a co-moderator of my session, I also interacted with one of the authors of the articles under discussion, in the form of email and Skype discussions.

I now come to a more speculative point, which I nevertheless want to make: I believe, based on the points above, and also on the feedback of many participants, that the format of the Learning Sessions is more effective than a usual workshop when the aim is to inspire new research and start new collaborations. Of course, there are only so many projects a person can be involved in, and apart from initiating new projects, finishing them is also important and usually more time-consuming. Nevertheless, for the often-stated aim of stimulating new research, it may be worth considering the format of the Learning Sessions, or something in the same spirit.

Compared to summer schools and mini-courses, our Learning Sessions were shorter, allowing for more diversity, while compared to workshop talks they were still long enough to allow in-depth coverage. An unusual feature (which, however, is similar to the German Arbeitsgemeinschaft) was that speakers were not authors of the results presented. This has a number of potential advantages:

1. Non-authors can potentially offer a fresher look at a subject, colored by their own experiences, and may have more feeling for the difficulties a beginner may encounter when trying to master a new topic.

2. This set-up can lead more easily to interaction between moderators and authors, and between one moderator and another.

3. The moderators who prepare a session potentially learn a lot themselves.

On the other hand, newcomers to a subject may have trouble getting to the core of matters, and even occasionally misrepresent or misunderstand part of the articles they are meant to explain. However, if this happens, does this not also point to the fact that not all articles are equally good at getting their message across, and hence strengthen the case for getting more people involved in spreading new knowledge?

Time will tell if our Learning Sessions will be a one-off experiment, or part of a larger move to find new ways of sharing new mathematical developments. For those who are interested in trying something similar, based on our experiences, we can offer the following bits of advice:

1. It seemed that sessions were especially successful if the moderators already had some, though perhaps not too much, prior experience with the subject.

2. It is worth thinking at an early stage about how much preparation can be expected from participants and what kind of material, if any, should be made available to them by the moderators for this aim.

3. If some sort of preparation is required, then it may be good to set deadlines for the moderators for when the preparatory material should be made available.

4. We probably profited from the fact that there is a functioning community in our sub-field of probability so that people trust each other and are (sometimes after a bit of nudging) willing to put in work for the community. Organizing a similar event with complete strangers may be harder.

And finally, a last point, that may be obvious but is still important:

5. We recommend that you should feel free to experiment and try something new. In our experience, it is fun to do and the result can be very rewarding!

Categories: Math and Stats

Meeting report: Senegal course on Records Theory

IMS Bulletin - Mon, 2017-10-02 09:12

Senegal course on Records Theory: foundation, estimation, prediction and characterization

The Department of Applied Mathematics and LERSTAD (Laboratoire d’Études et de Recherches en Statistiques et Développement) at the Université Gaston Berger (Saint-Louis, Senegal) hosted an invited international course on Records Theory (foundation, estimation, prediction and characterization). This international post-graduate course took place at the Gaston Berger University (UGB) from 20 to 31 March 2017, and was attended by participants from Senegal, Mali and Togo.

The invited speaker, Professor Mohammad Ahsanullah of Rider University (Lawrenceville, New Jersey, USA), successfully covered all aspects of records theory and presented some of its new research trends. He also left a series of open problems to be tackled by those in attendance.

It is expected that some members of the LERSTAD (both MS/PhD students and senior members) will undertake research activities on the topic of the course. Workshops are also scheduled within Africa to share the results of this fantastic experience.

This course is part of an initiative launched to diversify and to empower research activities locally at UGB, and more generally in Africa, to set up a MS–PhD mentoring system, and to initiate co-authoring with well-established scientists who wish to share with their African counterparts. A special effort is made to reach experts from North America, as in this case, with Prof Ahsanullah.

The course was organized by Prof Gane Samb Lo, under the supervision of the LERSTAD, Applied Mathematics Department and the Sciences Faculty. It was funded by the World Bank Excellence Center CEA-MITIC/UGB.

Future courses are planned within the scope described above. Contact Gane Samb Lo, President of the Statistics and Probability African Society (statpas.net), for further details: gane-samb.lo@ugb.edu.sn.

Some of the participants at the Records Theory course in Senegal in March

 

Categories: Math and Stats

Meeting report: 2017 Rao Prize Conference

IMS Bulletin - Mon, 2017-10-02 09:08

The Penn State Department of Statistics held the 2017 Rao Prize Conference on May 12, 2017, where three outstanding prize recipients were honored. The 2017 C. R. and Bhargavi Rao Prize recipient was Donald B. Rubin, the John L. Loeb Professor of Statistics at Harvard University. Paul R. Rosenbaum, the Robert G. Putzel Professor of Statistics at the Wharton School, was the 2017 C. G. Khatri Lecturer. The 2017 P. R. Krishnaiah Lecturer was Satish Iyengar, who is Professor of Statistics at the University of Pittsburgh.

The conference program consisted of these three plenary speakers, together with four invited speakers, and a poster presentation by graduate students. The four invited speakers were Samuel Kou (Professor of Statistics at Harvard University); Kari Lock Morgan (Assistant Professor of Statistics at Penn State); Joseph Schafer (Statistical Researcher at the US Census Bureau); and Dylan Small (the Class of 1965 Wharton Professor of Statistics at The Wharton School).

More information about the conference and the speakers (including videos of the talks) is online at http://stat.psu.edu/Events/2017-Rao-Prize.

The C. R. and Bhargavi Rao Prize was established to honor and recognize outstanding and influential innovations in the theory and practice of mathematical statistics, international leadership in directing statistics research, and pioneering contributions by a recognized leader in the field of statistics. The C. R. and Bhargavi Rao Prize is awarded in odd years. The 2019 Rao Prize Committee will accept nominations in 2018. The C. G. Khatri Memorial Lectureship and P. R. Krishnaiah Memorial Lectureship honor the memory of C. G. Khatri and P. R. Krishnaiah by inviting outstanding researchers in statistics to deliver lectures at Penn State. More details are on the web at http://stat.psu.edu/information/prizes-and-memorial-lectures.

Categories: Math and Stats

XL-Files: ISIPTA-ECSQARU, BFAS-SMPS & WHOA-PSI

IMS Bulletin - Mon, 2017-10-02 09:07
Contributing Editor Xiao-Li Meng has attended some great meetings, with exotic acronyms, over the summer. He writes:

No, these are not Chinglish or Pinyin. If you haven’t heard of ISIPTA-ECSQARU, then you are in the good company of 95–99% of statisticians and probabilists. I certainly hadn’t, until BFF4 (http://bulletin.imstat.org/2017/05/xl-files-bayesian-fiducial-and-frequentist-bff4ever/), thanks to Teddy Seidenfeld’s timely introduction. I could not have arranged a better sabbatical orientation than de-deaning in the paradise of Lugano and learning about the paradoxes of imprecise probability. The joint meeting of the 10th International Symposium on Imprecise Probability: Theories and Applications (ISIPTA) and the 14th European Conference on Symbolic and Quantitative Approaches to Reasoning with Uncertainty (ECSQARU), from July 10–14, provided me with the therapeutic freedom of not needing to figure out some smart phrases (for giving a talk) or smiling faces (of people I don’t recall). For the first time, I felt like a post-doc venturing into a neighboring field, and could say “Mom, I made it!” (Ages ago, when I told my mother that I had become an Assistant Professor, her response was, “So you didn’t make it to post-doc?” In Chinese, “post-doc” sounds like “beyond doctor” but “Assistant Professor” more like a TA.)

My glorious post-doctoral feeling was troubled by less sunny thoughts about the isolation of disciplines that share virtually the same goal. Such separation has led to much reinventing of wheels and repeating of mistakes, a clear waste of intellectual and other resources. But I do wonder if it also helped foster some very different perspectives that otherwise could not have survived. For example, as statisticians, our reactions to the Dempster–Shafer theory of belief functions have largely been either, “What’s that?” or “Who believes in belief functions?” Yet at ISIPTA-ECSQARU, Dempster’s combining rule was invoked far more frequently than Bayes’ rule.

There is also a Belief Functions and Applications Society (BFAS), and its 2018 conference will be jointly held with SMPS: Soft Methods in Probability and Statistics. Whereas the statistical community has generally shunned anything that is “softer” than probability, a recent anecdote reminded me that the “hardness” we statisticians have imposed on scientists may also be a contributing factor to the outcry of non-replicability. A student of mine asked an astrophysicist collaborator if he could provide some examples where he only has vague prior information, such as knowing a Poisson intensity is between 3 and 5, but nothing else. The astrophysicist was rather amused: “Some examples? Everything I do! It is you guys that always push us to make up an entire prior!” With the growing concerns of non-replicable studies, perhaps it is time for us statisticians to soften our hearts towards these soft methods that are designed to reduce the amount of “made-ups”?

Incidentally, my heart was further softened by the panda, lynx, and elephant that appeared at the second Workshop on High-Order Asymptotics and Post-Selection Inference (WHOA-PSI) at Washington University in St. Louis, August 12–14. Rest assured, no animal or human being was harmed. Todd Kuffner, a multi-talented composer, singer, and thinker, had this cool idea of using stuffed animals for timing speakers. Panda’s charm reminds me that I still have five minutes to charm the audience (and thanks to Todd, it now charms me every day: see below). Lynx is known for its speed: I have one minute to run for my life. But when the elephant is in the room, no one would be paying any attention to me, even if I were showcasing deep learning via its ability to reject the Riemann hypothesis with a p<0.005.

To ensure enough audience for the 7:45am sessions, Todd had another creative idea: 15-minute morning entertainment starting at 7:30am. With over 7000 YouTube subscribers (https://www.youtube.com/user/toddmakesnoise), Todd had little trouble enhancing the morning coffee with original songs that are as beautiful as his mind. He also provided me with a 15-minutes-of-fame slot: “Laughing early with XL,” another first-time adventure for me. My opening act (“A professor was dreaming that he was teaching. He woke up. He was.”) apparently was not as pungent for statisticians as “Why are standard deviations always 6?” (http://bulletin.imstat.org/2013/02/the-xl-files-a-fundamental-link-between-statistics-and-humor). Nevertheless, I am ready to make n=2…

So what softened your heart (or woke you up) this summer?

 

Sunrise with Teddy’s paper; and sunset with Todd’s panda

Categories: Math and Stats

Vladimir Voevodsky, 1966-2017

AMS Feed - Sun, 2017-10-01 23:00

Vladimir Voevodsky, an exceptional mathematician with deep insight who received the Fields Medal in 2002 for his development of a homotopy theory for algebraic varieties and his formulation of motivic cohomology, died September 30 at the age of 51. His work proved the Milnor conjecture, which for decades had been the major unsolved problem in algebraic K-theory. Voevodsky was a professor at the Institute for Advanced Study. In 2009 he proved the Bloch-Kato conjectures and more recently was working on homotopy type theory and computer-assisted proof verification. Voevodsky grew up in the Soviet Union and came to the U.S. to do graduate work at Harvard University, receiving his PhD in 1992 under the direction of David Kazhdan. He held positions at the Institute for Advanced Study, Harvard, the Max Planck Institute for Mathematics, and Northwestern University, before becoming a professor at the Institute in 2002.

For more about Voevodsky's work, see this 2002 article in Notices by Eric M. Friedlander and Andrei Suslin, Background on 2002 Fields and Nevanlinna Awardees by Allyn Jackson, "Voevodsky’s Univalence Axiom in Homotopy Type Theory" by Steve Awodey, Álvaro Pelayo, and Michael A. Warren in Notices (2013), and his biography at the MacTutor History of Mathematics archive. In that biography is this passage from Voevodsky, which gives an overview of the work for which he won the Fields Medal:

We start with geometry, the category of topological spaces. We invent something about this geometrical world using our basically visual intuition. The notion of pieces comes exclusively from visual intuition. We somehow abstract it and re-write it in terms of category theory which provides this connecting language. And then we apply in a new situation, in this case in the situation of algebraic equations which is purely algebraic. So what we get is some fantastic way to translate geometric intuition into results about algebraic objects. And that is from my point of view the main fun of doing mathematics.

A page posted by the Institute for Advanced Study will have information about a gathering being planned to celebrate Voevodsky's life and legacy. (Photo: Andrea Kane/Institute for Advanced Study, Princeton, NJ USA.)

Categories: Math and Stats

AMS and MAA Announce AMS Acquisition of MAA Book Program

AMS Feed - Thu, 2017-09-28 23:00

The American Mathematical Society and the Mathematical Association of America announced September 29 an agreement for the AMS to acquire the MAA's book publishing program. The high-quality mathematics titles and textbooks developed and edited by MAA Press will now be published as an imprint of the AMS Book Program. This agreement includes all existing MAA Press books, as well as a renewable license to continue to publish new books under the MAA Press imprint. Read more about the agreement.

Categories: Math and Stats

Global Math Week

AMS Feed - Thu, 2017-09-28 23:00

October 10-17, 2017: The aim of the Global Math Project is to get hundreds of thousands of people all over the world doing math (exploding dots) together at the same time.

Categories: Math and Stats

JSM 2017 in Baltimore

IMS Bulletin - Wed, 2017-08-30 10:06

The 2017 Joint Statistical Meetings in Baltimore, Maryland, which included the IMS Annual Meeting, took place from July 29 to August 3. There were over 6,000 participants from 52 countries, and more than 600 sessions. Among the IMS program highlights were the three Wald Lectures given by Emmanuel Candès, and the Blackwell Lecture by Martin Wainwright; Xiao-Li Meng writes about how inspirational these lectures (among others) were in the September Bulletin (here). There were also five Medallion lectures, from Edoardo Airoldi, Emery Brown, Subhashis Ghosal, Mark Girolami and Judith Rousseau.

Next year’s IMS lectures

At the IMS Presidential Address and Awards session (you can read Jon Wellner’s address in the next issue), the IMS lecturers for 2018 were announced. The Wald lecturer will be Luc Devroye, the Le Cam lecturer will be Ruth Williams, the Neyman lecture will be given by Peter Bühlmann, and the Schramm lecture by Yuval Peres. The Medallion lecturers are: Jean Bertoin, Anthony Davison, Anna De Masi, Svante Janson, Davar Khoshnevisan, Thomas Mikosch, Sonia Petrone, Richard Samworth and Ming Yuan.

Next year’s JSM invited sessions

If you’re feeling inspired by what you heard at JSM, you can help to create the 2018 invited program for the meeting in Vancouver (July 28–August 2, 2018). Submit an invited session proposal at http://ww2.amstat.org/meetings/jsm/2018/submissions.cfm#invited. But hurry: the deadline is September 6, 2017.

See the PDF of the September 2017 Bulletin for photos from the conference (pages 4–7 and a few more on pages 12–13).

The JSM2017 program is online at http://ww2.amstat.org/meetings/jsm/2017/onlineprogram/index.cfm.

Categories: Math and Stats
