Wayne Radinsky
The full text of Simone Scardapane's book Alice's Adventures in a Differentiable Wonderland is available online for free. It isn't available in print yet because it's still being written, and what's online is technically a draft, but Volume 1 looks pretty much done at about 260 pages. It introduces the mathematical fundamentals and then explains automatic differentiation. From there it applies the concept to convolutional layers, graph layers, and transformer models. A Volume 2 is planned covering fine-tuning, density estimation, generative modeling, mixture-of-experts, early exits, self-supervised learning, debugging, and other topics.

"Looking at modern neural networks, their essential characteristic is being composed by differentiable blocks: for this reason, in this book I prefer the term differentiable models when feasible. Viewing neural networks as differentiable models leads directly to the wider topic of differentiable programming, an emerging discipline that blends computer science and optimization to study differentiable computer programs more broadly."

"As we travel through this land of differentiable models, we are also traveling through history: the basic concepts of numerical optimization of linear models by gradient descent (covered in Chapter 4) were known since at least the XIX century; so-called 'fully-connected networks' in the form we use later on can be dated back to the 1980s; convolutional models were known and used already at the end of the 90s. However, it took many decades to have sufficient data and power to realize how well they can perform given enough data and enough parameters."

"Gather round, friends: it's time for our beloved Alice's adventures in a differentiable wonderland!"

Alice's Adventures in a Differentiable Wonderland

#solidstatelife #aieducation #differentiation #neuralnetworks
Will
@natewaddoups Yes, so first, neural nets use weights between neurons (nodes) in successive layers to capture relationships. The weights are continuous, and the network's output is differentiable with respect to them, so as training adjusts the weights iteratively (backprop) they stay continuous. Second, as the input is processed through the series of layers, each layer produces a high-dimensional representation of it. For example, the first few layers may characterize the input in terms of where they find "edges" or "corners", so you can say that the early stages "represent" edges.
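
Very roughly, in code (a toy PyTorch sketch I'm improvising, nothing specific to any particular paper or book), the iterative weight adjustment looks like this, and each layer's output is one of those high-dimensional representations:

```python
# Toy sketch (PyTorch): weights between layers are adjusted iteratively via
# backpropagation; each layer's output is a high-dimensional representation.
import torch

model = torch.nn.Sequential(
    torch.nn.Linear(10, 32),  # layer 1: its output is one "representation"
    torch.nn.ReLU(),
    torch.nn.Linear(32, 1),   # layer 2: maps that representation to the output
)
opt = torch.optim.SGD(model.parameters(), lr=0.01)

x = torch.randn(64, 10)
target = torch.randn(64, 1)

for _ in range(100):
    loss = ((model(x) - target) ** 2).mean()
    opt.zero_grad()
    loss.backward()  # backprop: a gradient for every weight
    opt.step()       # small, continuous adjustment of the weights
```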

In an LLM, the network seems to develop a representation of how similar words are in a high-dimensional "semantic space", so to speak.

Thus, for example, you can take the learned word vectors (the embeddings) and compute "king minus man plus woman", and the closest vector to the result is "queen".
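
If you want to see that arithmetic concretely, here's a toy sketch with made-up vectors (real systems use learned embeddings such as word2vec or GloVe and compare against a full vocabulary, but the idea is the same):

```python
# Toy sketch with made-up 4-d "embeddings"; real word vectors are learned and
# have hundreds of dimensions, but the analogy arithmetic works the same way.
import numpy as np

emb = {
    "king":  np.array([0.9, 0.8, 0.1, 0.0]),
    "queen": np.array([0.9, 0.1, 0.8, 0.0]),
    "man":   np.array([0.1, 0.8, 0.1, 0.2]),
    "woman": np.array([0.1, 0.1, 0.8, 0.2]),
}

query = emb["king"] - emb["man"] + emb["woman"]

def cosine(a, b):
    return a @ b / (np.linalg.norm(a) * np.linalg.norm(b))

# Pick the word whose vector is most similar to the query vector.
best = max(emb, key=lambda w: cosine(emb[w], query))
print(best)  # -> "queen" with these made-up vectors
```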

For a better explanation, see these links I just grabbed. I haven't read them, but they should answer your question.

https://kawine.github.io/blog/nlp/2019/06/21/word-analogies.html
https://www.technologyreview.com/2015/09/17/166211/king-man-woman-queen-the-marvelous-mathematics-of-computational-linguistics/
natewaddoups
I'm familiar with how neural networks work, and with word embeddings.
I just don't know what you mean by "continuous representations." Or rather, I can think of a couple of ways to interpret that term, and I'm wondering which one you had in mind, especially since my best guess would mean that you must already understand that the answer to your question is "yes."
