Pip, GLL, Theory In The Time of Big Data, here. Think about this in the context of Reverse Engineering the native-DNA code in the Biosphere given that CRISPR shows there is a push down stack in the native code. So perhaps you have 10**32 mega base pairs to test in the Biosphere and you want to reverse engineer the native-DNA code for something basic like cell division. The native code is the interesting part, right? Just figure something basic out without swinging for the fences. What is the incremental value of an additional new mega base pair when your objective is to reverse engineer cell division? Obviously you are not going to read them all to reverse the biosphere, there is not enough time. But that doesn’t mean the additional input is without value, right? You might use it. Think of it like reverse engineering the MCO given all the chess moves (and you don’t know upfront that they are chess moves), what is the incremental value of one more list of moves from a game? What if the list of moves comes from a Karpov game, any difference?
Think of the guys who reversed TOPS-10 off a PDP-10 they got on the black market in Moscow back in the day. What’s the value of a second PDP-10 to reversing TOPS-10? That is probably a big deal. What if they had a trillion more PDP-10s, any difference in the time to back out TOPS-10? Yes, this is like that old Frankenstein movie when Dr. Frankenstein pulls out a cell phone to call town in 1850 Bavaria. I know. Brian Redman did a 32-bit Solaris port to 64-bit AIX for Morgan Stanley’s Swaps Trading System by writing all the variables in all the lines of code out to a file and then comparing the files as he ported the code. Maybe you do something similar if you have a trillion PDP-10s and you just really really need TOPS-10 source?
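To make the trace-and-compare idea concrete, here is a minimal sketch in Python. It is not the actual method used on that port; the functions `checksum_32`, `checksum_64`, and `first_divergence` are hypothetical names invented for illustration, and the "porting bug" is a made-up word-size difference. Each version dumps its variables at every step, and the two dumps are compared to find the first place they disagree.

```python
def checksum_32(data, trace):
    """Original version (hypothetical): arithmetic wraps at 32 bits."""
    total = 0
    for i, byte in enumerate(data):
        total = (total * 31 + byte) & 0xFFFFFFFF
        trace.append(f"step={i} total={total}")  # dump variables at this step
    return total


def checksum_64(data, trace):
    """Ported version (hypothetical): arithmetic now wraps at 64 bits."""
    total = 0
    for i, byte in enumerate(data):
        total = (total * 31 + byte) & 0xFFFFFFFFFFFFFFFF
        trace.append(f"step={i} total={total}")  # same dump format as the original
    return total


def first_divergence(trace_a, trace_b):
    """Return (index, line_a, line_b) for the first differing record, else None."""
    for index, (line_a, line_b) in enumerate(zip(trace_a, trace_b)):
        if line_a != line_b:
            return index, line_a, line_b
    if len(trace_a) != len(trace_b):
        # One trace is a strict prefix of the other.
        return min(len(trace_a), len(trace_b)), None, None
    return None


if __name__ == "__main__":
    data = bytes([255] * 8)
    old_trace, new_trace = [], []
    checksum_32(data, old_trace)
    checksum_64(data, new_trace)
    print(first_divergence(old_trace, new_trace))
```

In practice the traces would go to files and be compared with a diff tool, but the point is the same: the first mismatching record tells you which step of the port changed behavior.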
Anna Gilbert and Atri Rudra are top theorists who are well known for their work in unraveling secrets of computation. They are experts on anything to do with coding theory; see the book draft by Atri with Venkatesan Guruswami and Madhu Sudan called Essential Coding Theory. They also do great theory research involving not only linear algebra but also much non-linear algebra of continuous functions and approximate numerical methods.
Today we want to focus on a recent piece of research they have done that is different from their usual work: It contains no proofs, no conjectures, nor even any mathematical symbols.
Their new working paper is titled “Teaching Theory in the time of Data Science/Big Data.” As you might guess, it is about the role of theory in the education of computer scientists today. The paper contains much information that they have collected on what is being taught at some of the top departments in computer science, and on how the current immense interest in Big Data is affecting classic theory courses.