Watch out, Big Data.

Normally it would be considered rude to call a class “bullshit,” but here’s one time you can get away with it.

Two University of Washington professors are teaching a course to help students “think critically about the data and models that constitute evidence in the social and natural sciences,” according to the introduction to the course.

The 160-seat seminar, titled “Calling Bullshit in the Age of Big Data,” begins in late March and continues for roughly 10 weeks. Members of the general public can follow the course syllabus, including readings and recordings of lectures, at the course’s website.

At the end of the course, students should be able to “provide your crystals-and-homeopathy aunt or casually racist uncle with an accessible and persuasive explanation of why a claim is bullshit,” according to the syllabus.

The syllabus went viral after it was posted last month, according to a Friday story in Stat News. The instructors’ email inboxes overflowed, and they even received some book offers. The course reportedly filled all open seats within the first minute of online registration at UW.

Carl Bergstrom, a professor in the university’s biology department, along with UW Information School assistant professor Jevin West, got the idea for the course through conversations they had while corresponding about articles they were reviewing for journals.

West told Recode that he and Bergstrom started to notice a trend in the last few years: More bullshit in the articles they were reviewing. “We think science is, sort of, it’s … at risk a little bit,” West said.

One major problem area: Big Data (one of the buzzwords of the century, which at its simplest refers to very large sets of data, but whose revolutionary potential has likely been overhyped). West said he noticed statistical methods meant for smaller data sets being applied to “big” data sets with millions or billions of examples, where it’s easy to force a correlation that isn’t necessarily accurate.
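One way this can happen, sketched here as an illustration rather than anything taken from the course: a significance test built for small samples will flag almost any correlation as "significant" once the sample is large enough. The Python snippet below assumes a hypothetical correlation of 0.01 and uses a normal approximation to the usual t-test; the function name and the numbers are invented for the example.

```python
import math


def correlation_p_value(r, n):
    """Approximate two-sided p-value for a Pearson correlation r on n samples.

    Uses the standard t-statistic for a correlation coefficient and a normal
    approximation to its tail (an assumption made to avoid dependencies).
    """
    t = r * math.sqrt((n - 2) / (1.0 - r * r))
    return math.erfc(abs(t) / math.sqrt(2.0))


r = 0.01                     # hypothetical, practically negligible correlation
variance_explained = r * r   # share of variance the correlation accounts for

p_small = correlation_p_value(r, 100)        # a "small data" sample
p_big = correlation_p_value(r, 1_000_000)    # a "big data" sample

print(f"variance explained: {variance_explained:.4%}")
print(f"p-value with 100 samples:       {p_small:.3g}")
print(f"p-value with 1,000,000 samples: {p_big:.3g}")
```

The correlation is identical in both cases and explains a vanishing share of the variance. Only the sample size changes, yet the larger sample pushes the p-value far below any conventional threshold.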

He also observed situations where machine-learning algorithms were “overfitting” data. Basically, an algorithm can match a particular data set so specifically, reflecting even its errors and noise, that it fails when applied to another data set where you would otherwise expect it to work. You would normally want an algorithm that is sufficiently general to fit more than one data set.
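A minimal sketch of overfitting, using made-up data rather than anything from the course: the Python snippet below compares a model that simply memorizes its training set (it predicts the value of the nearest training point) with a plain straight-line fit. The data-generating rule, the noise level, and all function names are assumptions chosen for illustration.

```python
import random

rng = random.Random(0)  # fixed seed so the run is repeatable


def make_data(n):
    """Hypothetical data: y = 2x plus Gaussian noise."""
    xs = [rng.uniform(0.0, 10.0) for _ in range(n)]
    ys = [2.0 * x + rng.gauss(0.0, 1.0) for x in xs]
    return xs, ys


def fit_line(xs, ys):
    """Ordinary least-squares straight line; returns a predict function."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return lambda x: slope * x + intercept


def fit_memorizer(xs, ys):
    """Overfit model: returns the y of the closest training x, noise included."""
    pairs = list(zip(xs, ys))
    return lambda x: min(pairs, key=lambda p: abs(p[0] - x))[1]


def mse(predict, xs, ys):
    """Mean squared error of a predict function on a data set."""
    return sum((predict(x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


train_x, train_y = make_data(200)
test_x, test_y = make_data(200)   # a second data set from the same process

line = fit_line(train_x, train_y)
memo = fit_memorizer(train_x, train_y)

line_train, line_test = mse(line, train_x, train_y), mse(line, test_x, test_y)
memo_train, memo_test = mse(memo, train_x, train_y), mse(memo, test_x, test_y)

print(f"line:      train MSE {line_train:.2f}, test MSE {line_test:.2f}")
print(f"memorizer: train MSE {memo_train:.2f}, test MSE {memo_test:.2f}")
```

The memorizer reproduces its training data perfectly because it has also learned the noise, and that is why it does worse than the simple line on the second data set.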

In addition to Big Data and machine learning issues, the course addresses fake news.

Source: Recode
Author: Tess Townsend
