0.0 Course directors

Saul Kato ([email protected]), Karunesh Ganguly ([email protected]), Reza Abbasi Asl ([email protected])

0.1 Course description

Many machine learning approaches can be thought of as a process of encoding high-dimensional data items into a low-dimensional space, then (optionally) decoding them back into a high-dimensional data space. This paradigm encompasses the endeavors of dimensionality reduction, feature learning, classification, and, of particular recent excitement, generative models. It has even been proposed as a model of human cognition. This course will survey uses of encoder-decoder models in current neuroscience research. Lectures will be given by UCSF and other neuroscientists or machine learning practitioners.
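To make the paradigm concrete, here is a minimal sketch (not course material, just an illustration on synthetic data) of the encode/decode round trip using PCA, with numpy and scikit-learn as assumed dependencies:

```python
# Minimal sketch of the encode -> decode paradigm, illustrated with PCA.
# The data here is synthetic and the 2-D latent space is an arbitrary choice.
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 50))        # 500 data items in a 50-D data space

pca = PCA(n_components=2)
Z = pca.fit_transform(X)              # encode: 50-D items -> 2-D latent codes
X_hat = pca.inverse_transform(Z)      # decode: 2-D codes -> 50-D reconstructions

mse = np.mean((X - X_hat) ** 2)       # how much information the round trip loses
print(f"latent shape: {Z.shape}, reconstruction MSE: {mse:.3f}")
```

The same encode -> latent -> decode skeleton applies when the encoder and decoder are neural networks (an autoencoder); only the mappings change, not the structure.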

0.2 Schedule

MWF 9-11am | Room MH-1406 in Mission Hall (changed from MH-2106)

First class: Monday April 22, 2024

Last class: Friday May 10, 2024

8 lectures from instructors and guests, plus student presentations at the last class.

Lecture schedule, subject to change:

https://docs.google.com/spreadsheets/d/1gOljBvrkZDBy9Y4Y71-KWCKBlcXBTZdL1B_BXctj9KA/edit#gid=0

Office hours (to help with projects): Friday May 3, 11am-1pm, and Monday May 6, 11am-1pm, in 481C in the Weill Building.

0.3 Project, group or individual

Proposal:

250-word maximum proposal (.pdf), due Sunday April 28 at midnight (changed from Friday April 26), emailed to [email protected]. Figures optional. References optional but appreciated.

Be sure to answer:

(1) What data are you analyzing?

(2) What question(s) would you like to ask of your data?

(3) What model / algorithm will you try first?

(4) What "positive computational control" (labeled ground truth data and/or synthetic data) will you validate your model on?

(5) What performance metric(s) will you assess your model with?
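As an illustration of questions (4) and (5), here is a hedged sketch of a positive computational control: synthetic data whose ground truth is known in advance, analyzed with a placeholder model and scored with placeholder metrics. The dataset, the logistic-regression model, and the chosen metrics are assumptions for illustration only, not requirements.

```python
# Hedged sketch of a "positive computational control": synthetic data whose
# ground truth is known, so the analysis can be validated before it touches
# real data. The model (logistic regression) and metrics are placeholders.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, f1_score

rng = np.random.default_rng(1)
n_trials, n_features = 400, 20

# Known structure: class-1 trials are shifted along feature 0.
labels = rng.integers(0, 2, size=n_trials)
data = rng.normal(size=(n_trials, n_features))
data[:, 0] += 2.0 * labels

model = LogisticRegression().fit(data, labels)
predicted = model.predict(data)

# If the model cannot recover this known, easy structure, it is not ready
# for the real dataset.
print("accuracy:", accuracy_score(labels, predicted))
print("F1 score:", f1_score(labels, predicted))
```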

Deliverables:

(1) .ipynb (Jupyter) notebook or git repo with a step-by-step readme, due Thu May 9 at midnight; just a URL is fine (we will try to run your code).

(2) 10-minute presentation on May 10 (on Google Slides, max 10 slides)

(3) course evaluation filed by the end of class on May 10 (important!).

Rubric:

you will get a Pass if you:

(1) have experimental/real-world data to analyze

(2) have created a positive control dataset (human-labeled ground truth data and/or synthetic data)

(3) have shown your model to give good results on your positive control data

(4) have measured the performance of your model on real-world (held-out) probe data or, even better, done some jackknifing (a minimal held-out/baseline evaluation is sketched after this rubric)

you will get a Pass- if you:

do all of the above but do not properly hold out your probe data

you will get a Pass+ if you:

(1) use your model to "hallucinate" new data

(2a) show that your model outperforms a naive baseline analysis (e.g. "guess the mean")

and/or

(2b) tweak your model at least once to improve model performance
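As a sketch of two of the rubric items above (evaluation on held-out probe data and a "guess the mean" baseline), here is a minimal, illustrative example; the toy regression data, the Ridge model, and the MSE metric are arbitrary stand-ins for whatever your project actually uses.

```python
# Hedged sketch of two rubric items: evaluation on held-out probe data and a
# naive "guess the mean" baseline. The toy regression data, Ridge model, and
# MSE metric are arbitrary stand-ins.
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(2)
X = rng.normal(size=(300, 10))
y = X @ rng.normal(size=10) + 0.5 * rng.normal(size=300)   # toy "real-world" data

# Hold out probe data that the model never sees during fitting.
X_train, X_probe, y_train, y_probe = train_test_split(X, y, test_size=0.2, random_state=0)

model = Ridge().fit(X_train, y_train)
model_mse = mean_squared_error(y_probe, model.predict(X_probe))

# Naive baseline: always predict the training-set mean.
baseline_mse = mean_squared_error(y_probe, np.full_like(y_probe, y_train.mean()))

print(f"model MSE: {model_mse:.3f}  vs  guess-the-mean MSE: {baseline_mse:.3f}")
```

A jackknife variant would repeat the fit while leaving out one observation (or one recording session) at a time and report the spread of the resulting performance estimates.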

Project tip: if you need to get up and running in a Python notebook with a minimum of fuss, this is a good way to do it: https://colab.research.google.com/

0.4 Bibliography

Lecture 1: the encoder-decoder framework

Background reading

No BS Guide to Linear Algebra (2020, Savov)

Principal Component Analysis (2002, Jolliffe) [pdf for UC]

Pattern Recognition and Machine Learning (2006, Bishop) [pdf]

Generative Deep Learning, 2nd edition, O’Reilly Series (2023, Foster)

History of neural networks

McCulloch, Warren S.; Pitts, Walter (1943-12-01). "A logical calculus of the ideas immanent in nervous activity". The Bulletin of Mathematical Biophysics. 5 (4): 115–133. doi:10.1007/BF02478259

Rosenblatt, Frank (1958) "The perceptron: A probabilistic model for information storage and organization in the brain". Psychological Review. 65 (6): 386–408. doi:10.1037/h0042519

Minsky, Marvin; Papert, Seymour (1988) Perceptrons: An Introduction to Computational Geometry. MIT Press (expanded edition; originally published 1969). The proof that single-layer perceptrons cannot compute the XOR function is widely credited with triggering the “AI winter”.

Rumelhart D., Hinton G., Williams R. (1986) “Learning representations by back-propagating errors”. Nature. Showed how to train a multi-layer neural network by gradient descent, applying the chain rule to compute d(cost function)/d(parameter) for every network parameter.
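A minimal numerical illustration of that idea (not from the paper; the layer sizes, tanh nonlinearity, and squared-error cost are arbitrary choices):

```python
# Hedged illustration of backpropagation: use the chain rule to get
# d(cost)/d(parameter) for every weight in a tiny 2-layer network.
import numpy as np

rng = np.random.default_rng(3)
x = rng.normal(size=(5, 1))            # one 5-D input
t = rng.normal(size=(2, 1))            # target output
W1, W2 = rng.normal(size=(3, 5)), rng.normal(size=(2, 3))

# Forward pass.
h = np.tanh(W1 @ x)                    # hidden-layer activity
y = W2 @ h                             # network output
cost = 0.5 * np.sum((y - t) ** 2)

# Backward pass: chain rule, layer by layer.
dy = y - t                             # d(cost)/dy
dW2 = dy @ h.T                         # d(cost)/dW2
dh = W2.T @ dy                         # error propagated to the hidden layer
dW1 = (dh * (1 - h ** 2)) @ x.T        # tanh'(a) = 1 - tanh(a)^2

# One gradient-descent step on every parameter.
lr = 0.1
W1 -= lr * dW1
W2 -= lr * dW2
```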

Universal function approximation

Palm G (1979) On the representation and approximation of nonlinear systems. Part II: Discrete time. Biol Cybern 34:49-52

Kolmogoroff, A.N. (1957) “On the representation of continuous functions of several variables by superposition of continuous functions of one variable and addition” (in Russian). Dokl. Akad. Nauk SSSR 114:953-956; English translation in AMS Transl. 2:55-59, 1963.

Hornik K, Stinchcombe M, White H. (1989) “Multilayer Feedforward Networks are Universal Approximators”. Neural Networks, Vol. 2, pp. 359-366

VAEs

Kingma D, Welling M (2013) “Auto-encoding Variational Bayes”

2D embedding methods: t-SNE and UMAP

https://pair-code.github.io/understanding-umap/ (awesome interactive comparison)

Transformers

Vaswani, A, Shazeer N, Parmar N, Uszkoreit J, Jones L, Gomez AN, Kaiser L, and Polosukhin I (2017) “Attention is all you need.” Advances in Neural Information Processing Systems.

https://www.youtube.com/@statquest (explainer YouTube videos about neural networks)

Lecture 2: demixed PCA, GPFA (Karunesh Ganguly)

Kobak,…, Machens (2015) “Demixed principal component analysis of neural population data”, eLife.

Lebedev… Nicolelis (2019) “Analysis of neuronal ensemble activity reveals the pitfalls and shortcomings of rotation dynamics”, Sci Reports.

Carryover discussion about performance metrics from first lecture:

https://en.wikipedia.org/wiki/Evaluation_of_binary_classifiers
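As a quick reference for that discussion, a small sketch of how the standard metrics on that page derive from the four confusion-matrix counts (the labels and predictions below are made up for illustration):

```python
# Hedged sketch: the common binary-classifier metrics all derive from the
# four confusion-matrix counts (TP, FP, TN, FN).
import numpy as np

y_true = np.array([1, 0, 1, 1, 0, 0, 1, 0, 1, 0])
y_pred = np.array([1, 0, 0, 1, 0, 1, 1, 0, 1, 1])

tp = np.sum((y_pred == 1) & (y_true == 1))
fp = np.sum((y_pred == 1) & (y_true == 0))
tn = np.sum((y_pred == 0) & (y_true == 0))
fn = np.sum((y_pred == 0) & (y_true == 1))

accuracy  = (tp + tn) / len(y_true)
precision = tp / (tp + fp)      # of the predicted positives, how many are real
recall    = tp / (tp + fn)      # of the real positives, how many were found
f1        = 2 * precision * recall / (precision + recall)

print(accuracy, precision, recall, f1)
```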

Questions: email saul.kato /at/ ucsf.edu