0.0 Course directors
Saul Kato ([email protected]), Karunesh Ganguly ([email protected]), Reza Abbasi Asl ([email protected])
0.1 Course description
Many machine learning approaches can be thought of as a process of encoding high-dimensional data items into a low-dimensional space, then (optionally) decoding them back into a high-dimensional data space. This paradigm encompasses the endeavors of dimensionality reduction, feature learning, classification, and of particular recent excitement, generative models. It has even been proposed as a model of human cognition. This course will survey uses of encoder-decoder models in current neuroscience research. Lectures will be given by UCSF and other neuroscientists or machine learning practitioners.
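To make the paradigm concrete, here is a minimal sketch (assuming numpy and scikit-learn are available) that uses PCA as the simplest linear encoder-decoder pair: data items are encoded into a 3-dimensional latent space and then decoded back into the original 100-dimensional data space. The dimensions and data are placeholders for illustration only.

```python
# Minimal encode -> decode sketch, with PCA standing in for the encoder-decoder pair.
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 100))      # 500 items in a 100-dimensional data space

pca = PCA(n_components=3)
Z = pca.fit_transform(X)             # encode: 100-d data -> 3-d latent codes
X_hat = pca.inverse_transform(Z)     # decode: 3-d codes -> 100-d reconstruction

print(Z.shape, X_hat.shape, np.mean((X - X_hat) ** 2))  # reconstruction error
```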
0.2 Schedule
MWF 9-11am | Room MH-1406 in Mission Hall
First class: Monday April 22, 2024
Last class: Friday May 10, 2024
8 lectures from instructors and guests, plus student presentations at the last class.
Lecture schedule, subject to change:
https://docs.google.com/spreadsheets/d/1gOljBvrkZDBy9Y4Y71-KWCKBlcXBTZdL1B_BXctj9KA/edit#gid=0
Office hours: Friday May 3, 11am-1pm, and Monday May 6, 11am-1pm, to help with projects. Room 481C in the Weill Building.
0.3 Project, group or individual
Proposal:
250-word maximum proposal (.pdf), due Sunday April 28 at midnight, emailed to [email protected]. Figures optional. References optional but appreciated.
Be sure to answer:
(1) What data are you analyzing?
(2) What question(s) would you like to ask of your data?
(3) What model / algorithm will you try first?
(4) What "positive computational control" (labeled ground truth data and/or synthetic data) will you validate your model on?
(5) What performance metric(s) will you assess your model with?
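As a hypothetical illustration of questions (4) and (5), the sketch below builds a synthetic positive-control dataset in which the correct answer is known by construction, fits a placeholder classifier, and scores it with an explicit metric. The specific model (logistic regression) and metric (ROC AUC) are assumptions for illustration, not requirements.

```python
# Sketch of a "positive computational control": synthetic labeled data with a
# known answer, used to check that a model and metric behave as expected.
# (Assumes numpy and scikit-learn; the classifier and metric are placeholders.)
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(0)
n, d = 400, 20
y = rng.integers(0, 2, size=n)             # ground-truth labels, known by construction
X = rng.normal(size=(n, d))
X[:, 0] += 2.0 * y                         # plant the class signal in feature 0

model = LogisticRegression().fit(X[:300], y[:300])
auc = roc_auc_score(y[300:], model.predict_proba(X[300:])[:, 1])
print(f"ROC AUC on the positive control: {auc:.2f}")   # should be near 1.0
```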
Deliverables:
(1) .ipynb (Jupyter) notebook or git repo with a step-by-step readme, due Thu May 9 at midnight; just a URL is fine. [we will try to run your code]
(2) 10 minute presentation on May 10 (on Google Slides, max 10 slides)
(3) course evaluation filed by the end of class on May 10 (important!).
Rubric:
you will get a Pass if you:
(1) have experimental/real-world data to analyze
(2) have created a positive control dataset (human-labeled ground truth data and/or synthetic data)
(3) have shown your model to give good results on your positive control data
(4) have measured the performance of your model on real-world (held-out) probe data, or, even better, done some jackknifing (see the sketch after this rubric)
you will get a Pass- if you:
do all of the above but do not properly hold out your probe data
you will get a Pass+ if you:
(1) use your model to "hallucinate" new data
(2a) show that your model outperforms a naive baseline analysis (e.g., "guess the mean")
and/or
(2b) tweak your model at least once to improve model performance
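The sketch below is an assumption-laden illustration of the criteria above, not a required template: it evaluates on held-out probe data, compares against a "guess the mean" baseline, and jackknifes the performance estimate. The toy regression problem and Ridge model are placeholders.

```python
# Sketch of held-out ("probe") evaluation, a naive "guess the mean" baseline,
# and a jackknifed error bar on the performance estimate.
# (Assumes numpy and scikit-learn; Ridge regression is an illustrative placeholder.)
import numpy as np
from sklearn.linear_model import Ridge

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 10))
y = X @ rng.normal(size=10) + 0.5 * rng.normal(size=200)   # toy regression target

# hold out probe data that the model never sees during fitting
X_train, X_probe, y_train, y_probe = X[:150], X[150:], y[:150], y[150:]

model = Ridge().fit(X_train, y_train)
errs = (model.predict(X_probe) - y_probe) ** 2
mse_model = errs.mean()
mse_baseline = np.mean((y_train.mean() - y_probe) ** 2)    # "guess the mean" baseline
print(f"model MSE: {mse_model:.3f}   baseline MSE: {mse_baseline:.3f}")

# leave-one-out jackknife over probe points puts an error bar on the MSE estimate
loo = np.array([np.delete(errs, i).mean() for i in range(len(errs))])
print(f"jackknife SE of model MSE: {np.sqrt((len(loo) - 1) * np.var(loo)):.3f}")
```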
Project tip: if you need to get up and running in a Python notebook with a minimum of fuss, this is a good way to do it: https://colab.research.google.com/
0.4 Bibliography
Lecture 1: the encoder-decoder framework
Background reading
No BS Guide to Linear Algebra (2020, Savov)
Principal Component Analysis (2002, Jolliffe) [pdf for UC]
Pattern Recognition and Machine Learning (2006, Bishop) [pdf]
Generative Deep Learning, 2nd edition, O’Reilly Series (2023, Foster)
History of neural networks
McCulloch, Warren S.; Pitts, Walter (1943-12-01). "A logical calculus of the ideas immanent in nervous activity". The Bulletin of Mathematical Biophysics. 5 (4): 115–133. doi:10.1007/BF02478259
Rosenblatt, Frank (1958) "The perceptron: A probabilistic model for information storage and organization in the brain". Psychological Review. 65 (6): 386–408. doi:10.1037/h0042519
Minsky, Marvin; Papert, Seymour (1988) Perceptrons: An Introduction to Computational Geometry. MIT Press (expanded edition; first published 1969). The proof that single-layer perceptrons cannot compute the XOR function is often credited with triggering the first "AI winter".
Rumelhart D., Hinton G., Williams R. (1986) "Learning representations by back-propagating errors". Nature. Showed how to train a multi-layer neural network by gradient descent, using the chain rule to compute d(cost function)/d(parameter) for all network parameters.
Universal function approximation
Palm G (1979) On the representation and approximation of nonlinear systems. Part II: Discrete time. Biol Cybern 34:49-52
Kolmogoroff, A.N. (1957) “On the representation of continuous functions of several variables by superposition of continuous functions of one variable and addition” (Russian). Dokl. Akad. Nauk. SSSR 114:953-956; 1957; AMS Transl. 2:55-59; 1963.
Hornik K, Stinchcombe M, White H. (1989) "Multilayer Feedforward Networks are Universal Approximators" Neural Networks, Vol. 2, pp. 359-366
VAEs
Kingma D, Welling M (2013) "Auto-Encoding Variational Bayes", arXiv:1312.6114
2D embedding methods: t-SNE and UMAP
https://pair-code.github.io/understanding-umap/ awesome interactive comparison
Transformers
Vaswani, A, Shazeer N, Parmar N, Uszkoreit J, Jones L, Gomez AN, Kaiser L, and Polosukhin I (2017) “Attention is all you need.” Advances in Neural Information Processing Systems.
https://www.youtube.com/@statquest (explainer YouTube videos about neural networks)
Lecture 2: demixed PCA, GPFA (Karunesh Ganguly)
Kobak, …, Machens (2016) "Demixed principal component analysis of neural population data", eLife.
Lebedev, …, Nicolelis (2019) "Analysis of neuronal ensemble activity reveals the pitfalls and shortcomings of rotation dynamics", Scientific Reports.
Carryover discussion about performance metrics from first lecture:
https://en.wikipedia.org/wiki/Evaluation_of_binary_classifiers
Questions: email saul.kato /at/ ucsf.edu